Jun 27 2012
 

This is a followup to the final section of A Rubyist has Some Difficulties with Go called “Limits to Polymorphism and Method Dispatch in Go”.

I believe I have developed an understanding of what functionality Go is providing here. I’ve written another little test program; an executable version is here, and the gist:

What does the use of A in the following definition mean?

type A struct{}
type B struct {
  A
}

It’s described in various ways, such as “struct A is embedded in B” or “A is an anonymous member of B”, as an example of “composition rather than inheritance”. Not having a useful mental model of what it means has obviously been the source of my confusion (even I knew that much :-)

The turning point for me was the realisation that the following two lines are essentially identical:

v.SomeAMethod()
v.A.SomeAMethod()

Since they are essentially the same, it shouldn’t be any surprise when the implementation of SomeAMethod doesn’t know anything about v. Without knowing that they are the same, I was thinking that the receiver had to be v, and since it wasn’t, I was at a bit of a loss. What appears to OO-centric me as a loss of information about the receiver of the method call is really a misunderstanding, by me, of what the receiver actually is. It’s not v, it’s v.A.

So it seems that v.SomeAMethod() is a syntactic convenience provided by Go for the programmer. And maybe a little more.
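Here is a minimal sketch of that equivalence. A and B are the types from above; the body of SomeAMethod is made up for illustration.

```go
package main

import "fmt"

type A struct{}

type B struct {
	A // embedded: A's methods are promoted to B
}

// SomeAMethod only ever sees its own receiver, which is an A.
// (The body is an assumption made for this sketch.)
func (a A) SomeAMethod() string {
	return fmt.Sprintf("receiver is %T", a)
}

func main() {
	v := B{}
	fmt.Println(v.SomeAMethod())   // the shorthand
	fmt.Println(v.A.SomeAMethod()) // what the shorthand stands for
}
```

Both lines print the same thing, and neither mentions B: by the time the method runs, v has already been narrowed to v.A.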

So this leaves polymorphism in Go as pretty much entirely provided by interfaces. Interfaces consist of a set of named methods. It is not possible to explicitly state that a type satisfies an interface, the compiler works that out. The convenience is extended a little by applying the syntactic trick of dropping the member name for anonymous members when deciding what methods are implemented by the type. And so SomeAMethod is a method offered by B because it’s offered by A.
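A small sketch of that rule, assuming a hypothetical interface called Doer (the name and the method body are invented here). B never declares a method and never mentions Doer, yet the compiler accepts it wherever a Doer is wanted.

```go
package main

import "fmt"

// Doer is a hypothetical interface used only for this sketch.
type Doer interface {
	SomeAMethod() string
}

type A struct{}

func (A) SomeAMethod() string { return "done by A" }

// B has no methods of its own; it only embeds A.
type B struct {
	A
}

// run accepts anything whose method set includes SomeAMethod.
func run(d Doer) string { return d.SomeAMethod() }

func main() {
	fmt.Println(run(A{})) // A satisfies Doer directly
	fmt.Println(run(B{})) // B satisfies Doer through the promoted method
}
```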

This kind of thing is often referred to as delegation. In this case, B delegates SomeAMethod to its anonymous A. The trick with delegation is what the receiver (i.e. self) is in the method. In Go the receiver is the anonymous A instance of the B instance v. If, instead, the receiver was v, then we’d have the delegation found in programming languages like Self and even JavaScript. And that kind of delegation is equivalent to inheritance. The important difference is where methods called within the delegated method are looked for.
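To make the lookup difference concrete, here is a sketch with invented names (Namer, Name, Greet, GreetAs). Greet behaves the way Go's embedding does. GreetAs imitates the other kind of delegation by taking the "self" as an explicit argument; that is one possible workaround, not a language feature.

```go
package main

import "fmt"

// Namer and every method name below are hypothetical.
type Namer interface {
	Name() string
}

type A struct{}

func (A) Name() string { return "A" }

// Greet is promoted to B, but its receiver is always the embedded A,
// so a.Name() is looked up on A.
func (a A) Greet() string { return "hello from " + a.Name() }

// GreetAs receives the outer value explicitly, so Name is looked up on
// whatever was passed in.
func (A) GreetAs(self Namer) string { return "hello from " + self.Name() }

type B struct {
	A
}

// Name on B shadows the promoted A.Name.
func (B) Name() string { return "B" }

func main() {
	v := B{}
	fmt.Println(v.Greet())    // hello from A
	fmt.Println(v.GreetAs(v)) // hello from B
}
```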

 Posted by at 3:23 pm
Jun 26 2012
 

Update: I’ve posted a followup Followup to A Rubyist has Some Difficulties with Go: Limits to Polymorphism and Method Dispatch in Go that addresses my current thoughts about the last point in this article.

Update: There’s a thread discussing this post on the Golang-nuts mailing list and I hope that pointer works for you.

Update: Steven Degutis has posted an article Ruby Accessors Considered Pernicious in response to part of this post and the thread on the golang group. He makes some reasonable statements, but as usual, I’ll take the position that while the usual situation shouldn’t play accessor tricks, it’s incredibly valuable for not-so-usual situations.

I’ve been using Golang for perhaps 18 months now, happily and successfully. I’ve implemented quite a number of smallish projects ranging from a couple of hundred lines to maybe a couple of thousand, as well as three or four larger projects of maybe around 5k lines each. Go was a very good fit for these, in some cases a spectacularly good fit.

Over the same period of time I’ve used Clojure and, especially, Ruby to implement several much larger projects. I’m now looking at a new project that I’d like to write in Go, one that I’d have, unhesitatingly, until now, written in Ruby. I imagine I’m one of the unexpected migrants that Rob Pike talks about in his article from yesterday, Less is exponentially more. Part of my decision process has been to prototype fragments of the solution in Go. This is where these difficulties arose. I’ve got ugly workarounds to all but the last issue, so it’s not like these are show stoppers (well maybe the last one).

Let’s be clear here, I have a very high regard for Go. I intend to use it many times in the future. I recommend people consider it seriously. It just isn’t quite the breeze it seems it’ll be when coming from something like Ruby. I can’t help but think that the difficulties below are consequences of the lower level nature of Go. And maybe that’s the interesting part of this article.

If I’m lucky, I’m totally wrong about them all, and someone can set me straight.

Nonconformity with Uniform Access Principle

The Uniform Access Principle (UAP) was articulated by Bertrand Meyer in defining the Eiffel programming language. This, from the Wikipedia Article pretty much sums it up: “All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation.”

Or, alternatively, from this article by Meyer: “It doesn’t matter whether that query is an attribute (representing an object field) or a function (representing a computation); this is an internal representation decision, irrelevant to clients accessing objects through calls such as [ATTRIB_ACCESS]. This “Principle of Uniform Access” — it doesn’t matter to clients whether a query is implemented as an attribute or a function”

Languages like Eiffel, Ruby, and Common Lisp all satisfy the UAP, as does Python in its own way. Languages like C, C++, and Go do not.

In Go, the usual way of accessing an attribute of a struct is something like:

Person.Name = "Jack"
fmt.Println(Person.Name)

There is no opportunity to get in there and change how the value of the attribute is obtained. Of course, it’s possible to take a getter/setter approach:

Person.SetName("Jack")
fmt.Println(Person.GetName())
fmt.Println(Person.Name)

but this does not prevent the third line.

The problem with this is that when developing a framework you want to minimize the error-prone boiler plate that the user of the framework has to write. It is also desirable to minimise the non-idiomatic code (the Person.GetName() is non-idiomatic in Go).
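The closest Go gets to uniform access is to put the query behind a method and have clients go through an interface. The sketch below uses invented types (Named, Stored, Computed). It also uses Name() rather than GetName(), which is the usual Go naming for a getter. Note that an unexported field is only hidden from other packages, so this narrows the problem rather than removing it.

```go
package main

import "fmt"

// Named is a hypothetical interface: clients that go through it cannot
// tell storage from computation.
type Named interface {
	Name() string
}

// Stored keeps the name in a field.
type Stored struct {
	name string
}

func (s Stored) Name() string { return s.name }

// Computed derives the name on every call.
type Computed struct {
	first, last string
}

func (c Computed) Name() string { return c.first + " " + c.last }

func describe(n Named) string { return "name: " + n.Name() }

func main() {
	fmt.Println(describe(Stored{name: "Jack"}))                   // name: Jack
	fmt.Println(describe(Computed{first: "Jack", last: "Sprat"})) // name: Jack Sprat
}
```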

Lack of Struct Immutability

By “immutability” I mean that there’s no way for a framework to prevent changes to structures that are supposed to be managed by the framework. This is related to Go’s non-conformity to the UAP (above). I don’t believe there’s a way to detect changes either.

This can probably be done by implementing immutable/persistent data structures along the line of what Clojure did, but these will not have literal forms like Arrays and Maps do now in Go. Once again, this is, in part, an issue of non-idiomatic use of Go.
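A much smaller step in that direction, sketched here with a hypothetical Point type, is to keep the fields unexported and only offer methods that return modified copies. This assumes the type lives in the framework's own package, since unexported fields are still writable from inside the package that defines them, and it is nothing like a full persistent data structure.

```go
package main

import "fmt"

// Point is a hypothetical framework-managed value. In a real framework
// it would live in its own package so that clients cannot reach x and y.
type Point struct {
	x, y int
}

func NewPoint(x, y int) Point { return Point{x: x, y: y} }

func (p Point) X() int { return p.x }
func (p Point) Y() int { return p.y }

// WithX returns a modified copy. The receiver is passed by value, so
// the caller's Point is left untouched.
func (p Point) WithX(x int) Point {
	p.x = x
	return p
}

func main() {
	a := NewPoint(1, 2)
	b := a.WithX(10)
	fmt.Println(a.X(), b.X()) // 1 10
}
```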

Lack of Optional/Default/Named Function Arguments, and No Overloading

A function has a single signature, and that signature does not support either optional or named arguments. The consequence is that all arguments must always be provided in the function call. This is a potential source of errors and requires a lot of knowledge on the user’s part to know what the suitable “don’t care” values might be. When the usual usage involves a small number of parameters with the remaining being ‘the usual’ then we’ve also got more complex function calls than would be necessary. There’s a maintenance issue here as well since, effectively, the function calls are over-specified. When the API has to change you have to consider each call much more carefully.

Library writers sometimes try to use the “zero” value for the arguments, recognize that when passed in, and substitute in the actual default value. This separates the code as it appears in usage from the meaning of the code, at the very least this is a documentation problem, and could represent a real source of error.
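A sketch of the zero-value trick, using a made-up Connect function and a made-up default of 30 seconds:

```go
package main

import "fmt"

// Connect, its parameters and the default are all hypothetical.
const defaultTimeoutSeconds = 30

func Connect(host string, timeoutSeconds int) string {
	if timeoutSeconds == 0 { // zero is taken to mean "use the default"
		timeoutSeconds = defaultTimeoutSeconds
	}
	return fmt.Sprintf("%s (timeout %ds)", host, timeoutSeconds)
}

func main() {
	fmt.Println(Connect("example.org", 0)) // example.org (timeout 30s)
	fmt.Println(Connect("example.org", 5)) // example.org (timeout 5s)
}
```

The call site says 0 and the function uses 30, which is the gap between usage and meaning described above. It also leaves the caller no way to ask for a genuine zero.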

It’s possible to use a literal map to pass optional parameters. To do this the user will be expected to write something like:

type nparams map[string]interface{} // Probably defined once in the framework
...
SomeFunction(1, nparams{"p1": 2, "p2": "hello"})

Not pretty, but not completely disgusting either. The nparams type is just a shorter name for the map type; it shortens the code and removes some gratuitous ugliness. Without the map of string to interface{} you’d be stuck with a single type of named argument. This technique makes the implementation of SomeFunction pretty ugly (but that’s where ugliness belongs if it has to be somewhere).
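For a feel of what that ugliness looks like, here is one way SomeFunction might unpack the map. The defaults, the meaning of p1 and p2, and the decision to reject unknown keys are all assumptions made for this sketch.

```go
package main

import "fmt"

type nparams map[string]interface{}

// SomeFunction's behaviour is hypothetical; only its call shape comes
// from the example above.
func SomeFunction(n int, opts nparams) (string, error) {
	p1, p2 := 1, "default" // assumed defaults
	for key, value := range opts {
		switch key {
		case "p1":
			v, ok := value.(int)
			if !ok {
				return "", fmt.Errorf("p1: want int, got %T", value)
			}
			p1 = v
		case "p2":
			v, ok := value.(string)
			if !ok {
				return "", fmt.Errorf("p2: want string, got %T", value)
			}
			p2 = v
		default:
			return "", fmt.Errorf("unknown parameter %q", key)
		}
	}
	return fmt.Sprintf("n=%d p1=%d p2=%s", n, p1, p2), nil
}

func main() {
	fmt.Println(SomeFunction(1, nparams{"p1": 2, "p2": "hello"})) // n=1 p1=2 p2=hello <nil>
	fmt.Println(SomeFunction(1, nil))                             // n=1 p1=1 p2=default <nil>
	fmt.Println(SomeFunction(1, nparams{"p1": "two"}))
}
```

Every check the compiler would normally do for a typed argument (name, type, presence) has moved to run time.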

The usual safest way to handle this is to offer several different functions with different parameters and different names. This, of course, makes the APIs more complex and all the bad things that flow from that have to be contended with.

Unyielding Enforcement of ‘Unused’ Errors

The implementors of Go have made a couple of rules that I’m very happy that they’re enforcing. Two in particular: “there are no compiler warnings, only errors”, and “unused things (variables, imports, etc.) are errors”.

The annoying part is that a Go programmer can’t get around this even temporarily. So if there’s something they are trying to experiment with (i.e. figure out how it works) it isn’t always possible to just comment out surrounding code. Sometimes the commenting makes variables or import statements unused… and it won’t compile. So the programmer has to comment out still more code. This can happen a few times before it successfully compiles. Then the programmer has to uncomment the stuff just commented out. Error prone, messy, etc.

There are hacky ways to get around this, but these allow unused things to end up in the final code if programmers are careless/forgetful/busy/rushed.
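The two hacks seen most often both use the blank identifier. In this sketch the compute function and the commented-out debug line are invented for illustration.

```go
package main

import (
	"fmt"
	"os" // only needed by the commented-out line below
)

// Keeps the "os" import alive while the code that uses it is commented out.
var _ = os.Exit

func compute() int {
	result := 42
	debug := "intermediate state" // only needed by the commented-out line
	_ = debug                     // silences "declared and not used"
	// fmt.Fprintln(os.Stderr, debug)
	return result
}

func main() {
	fmt.Println(compute()) // 42
}
```

Nothing forces anyone to delete the two blank assignments once the experiment is over, which is exactly how unused things end up in the final code.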

This is really just an annoyance but it stands out, to me at least. Here’s a compiler that comes with tools for sophisticated functions like formatting, benchmarking, memory and CPU profiling, and even “go fix” yet they leave us temporarily commenting out code just to get it to compile? Sigh.

Lack of Go Routine Locals

There is no way to store Go routine specific data that’s accessible to any function executing in the go routine. This would be analogous to thread locals, but thread locals don’t make a lot of sense, maybe no sense at all, in Go. Yes, this is just an inconvenience but, again, the only ways that I’ve come across that get around it require error-prone, non-idiomatic, boiler-plate code.
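The boiler-plate in question usually amounts to threading a value through every call by hand. A minimal sketch, with a hypothetical requestState type and function names:

```go
package main

import (
	"fmt"
	"sync"
)

// requestState is a hypothetical stand-in for goroutine-local data.
type requestState struct {
	id int
}

// Every function on the call path has to accept and forward the state.
func handle(st *requestState) string { return step(st) }

func step(st *requestState) string { return fmt.Sprintf("request %d", st.id) }

// runAll starts n goroutines, each with its own state.
func runAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = handle(&requestState{id: i}) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(runAll(3)) // [request 0 request 1 request 2]
}
```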

Lack of Weak References

There’s no direct support for weak references in Go. There are tricks that can be done with finalizers that allow you to do something to address the problem, or a variation of it at least (kinda like a limited form of strong pointers), but it isn’t the same. This is, again, an inconvenience, but weak references would be nice to have. I’ve worked around it so far by having a go routine manage the resource, but once again error-prone, non-idiomatic, and boiler-plate code is involved.

Limits to Polymorphism and Method Dispatch

This is a big issue for me, I think the biggest so far. I want to avoid the type system jargon as much as possible.

I hope this makes sense.

Here’s the code:

The same code is on play.golang.org where you can actually execute it.

The output is:

Base...
Step1AsMethod (*main.Base)
Step2AsMethod (*main.Base/*main.Base) 0xf8400240a0 &main.Base{Thing:main.Thing(nil)}
Step1AsFunction (*main.Base) &main.Base{Thing:main.Thing(nil)}
Step2AsMethod (*main.Base/*main.Base) 0xf8400240a0 &main.Base{Thing:main.Thing(nil)}

Derived...
Step1AsMethod (*main.Base)
Step2AsMethod (*main.Base/*main.Base) 0xf840024230 &main.Base{Thing:main.Thing(nil)}
Step1AsFunction (*main.Derived) &main.Derived{Base:main.Base{Thing:main.Thing(nil)}}
Derived Step2AsMethod (*main.Derived)

The trouble is illustrated in the output of the Derived struct. When Step2AsMethod is run from Step1AsMethod it is the method defined for the Base struct that is run, not the one defined for Derived. When the Function Step1AsFunction is run the correct Step2AsMethod is called.

What is happening here is that the parameters to functions are polymorphic in interfaces and the runtime type/struct information is used to select the method. In the case of methods, the dispatching type/struct is the type used not the runtime struct/type, and so Step2AsMethod for the base struct/type is called.
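The sketch below is a cut-down reconstruction of the situation, not the original program. The type and method names are taken from the output above; the bodies, and the method set of Thing, are guesses.

```go
package main

import "fmt"

// Thing is the interface both structs satisfy (its method set is assumed).
type Thing interface {
	Step2AsMethod() string
}

type Base struct{}

func (b *Base) Step2AsMethod() string { return "Step2AsMethod (Base)" }

// Step1AsMethod is promoted to Derived, but b is always a *Base here,
// so the call below is bound to (*Base).Step2AsMethod.
func (b *Base) Step1AsMethod() string { return b.Step2AsMethod() }

type Derived struct {
	Base
}

func (d *Derived) Step2AsMethod() string { return "Derived Step2AsMethod" }

// Step1AsFunction takes the interface, so the call is dispatched on the
// runtime type of t.
func Step1AsFunction(t Thing) string { return t.Step2AsMethod() }

func main() {
	d := &Derived{}
	fmt.Println(d.Step1AsMethod())  // Step2AsMethod (Base)
	fmt.Println(Step1AsFunction(d)) // Derived Step2AsMethod
}
```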

What cannot be achieved is something equivalent to this simple Ruby program:

Of these problems, this one is my biggest concern. I’ve already run into it several times and have been able to work around it, but barely… had me sweating for a while.

I’m afraid that if I get trapped with this somehow the only way out will be a massive unmaintainable, ugly, error-prone type switch.

 Posted by at 5:05 pm

Comparing JSON and XML as DataFormats… Again

Jun 22 2012
 

This morning I was reading a post titled: JSON versus XML: Is JSON Really Better than XML?. I already have an opinion on this issue, so why do I read this stuff? Anyway, I found myself getting a little annoyed. This afternoon, a wonderful warm sunny Friday afternoon, I started wondering why I was annoyed. Maybe there’s stuff to criticise in the article, but, aside from one of his comments, there’s nothing egregiously wrong in the article. But annoyed I am.

I think I know what the problem is. The author (Isaac Tayor) is trying to examine the suitableness of JSON and XML for use as a data format. He’s considering these criteria:

  • readability
  • conciseness
  • parsing time

Fair enough. Good idea even.

First we are presented with some XML:

If I recall, I’m already annoyed. Then we are presented what is purported to be equivalent JSON that he writes as:

From here on, his analysis assumes the equivalence of these representations.

Well. They aren’t equivalent. XML is a significantly more powerful format than JSON, and the power has a cost. There are alternative representations that are, in my opinion at least, more suitable. I believe the question should be if XML’s power pays off as a data representation, especially when compared to JSON. But let’s not use the heaviest, ugliest, form of XML that we can imagine (the only thing to make it worse would be adding in some namespaces, or maybe an embedded DTD). And this ugly format is what everyone seems to use, I don’t want to single out Isaac here.

Let’s try something a little nicer (but still not equivalent):

The difference is in using elements+content vs. attributes for, well, attribute data. Certainly fair in a dataformat. The description element is worth paying attention to since it’s not an attribute as I’ve written it. In XML the values of an attribute are subject to Attribute-Value Normalization, so I’ve found that it’s best to write text as content of an element rather than as an attribute value where whitespace matters. I’m assuming that whitespace matters in the description but not the other attributes.

So what have we got here?

Readability? Highly subjective but I’d say pretty comparable. I happen to prefer the XML, but maybe that’s because I’m used to it.

Conciseness? If we get rid of unnecessary whitespace (the indentation) then we’ve got:

File               Bytes
book.json          358
book-nicer.xml     352
book.json.gz       269
book-nicer.xml.gz  269

That’s purely a co-incidence with the compressed sizes. But, what can I say? And you’ll notice that the XML is shorter, and it’d be even shorter still if we didn’t care about whitespace in the description.

Speed of parsing. Isaac’s benchmarking used Java with the built in XML parser and GSON from Google for the JSON parsing. I’ve got a feeling they aren’t quite doing the same thing. On top of that, micro benchmarks on the JVM are really hard. See these related pages for more on some of the issues and a handy library:

I’m doing this largely on a whim, and at the moment I’m somewhat interested in Google’s Go and, as it happens, especially its XML and JSON parsing, so I’ve written some code: main.go and bmark_test.go. Go provides libraries to handle JSON unmarshalling and XML unmarshalling and decoding. I’ve put a bit of code into that main.go file to illustrate.

Update: I simplified the main.go a bit, the JSON and XML are now populating the same BookInfo struct.
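To give a rough idea of what populating one struct from both formats looks like, here is a sketch. The BookInfo fields and both documents are invented for illustration and are not the ones from main.go. The struct tags carry the elements-versus-attributes choice: title and author are XML attributes, description is element content.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// BookInfo's fields are hypothetical.
type BookInfo struct {
	Title       string `json:"title" xml:"title,attr"`
	Author      string `json:"author" xml:"author,attr"`
	Description string `json:"description" xml:"description"`
}

const jsonDoc = `{"title": "A Title", "author": "An Author", "description": "Some text."}`

const xmlDoc = `<book title="A Title" author="An Author"><description>Some text.</description></book>`

func fromJSON(s string) (BookInfo, error) {
	var b BookInfo
	err := json.Unmarshal([]byte(s), &b)
	return b, err
}

func fromXML(s string) (BookInfo, error) {
	var b BookInfo
	err := xml.Unmarshal([]byte(s), &b)
	return b, err
}

func main() {
	j, jerr := fromJSON(jsonDoc)
	x, xerr := fromXML(xmlDoc)
	fmt.Println(j, jerr)
	fmt.Println(x, xerr)
	fmt.Println("same result:", j == x)
}
```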

XML decoding means more-or-less handling raw XML events. In the case of Go, the decoder is similar to a pull parser (you ask for the next event) rather than a SAX parser that pushes events at a handler that you provide. In both JSON and XML the unmarshaller will actually stuff values into the fields of structures. It’s convenient, but for the kind of thing that I do it’s relatively rarely of interest. This is almost certainly not the usual preference among Go programmers, just mine.
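A minimal sketch of the pull style using encoding/xml’s Decoder. The elementNames helper and the sample document are made up; the loop simply asks for the next token until the input runs out.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// elementNames pulls tokens one at a time and records every start tag.
func elementNames(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var names []string
	for {
		tok, err := dec.Token() // ask for the next event
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return names, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			names = append(names, start.Name.Local)
		}
	}
}

func main() {
	doc := `<book title="A Title"><description>Some text.</description></book>`
	fmt.Println(elementNames(doc)) // [book description] <nil>
}
```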

Go provides a benchmarking capability as part of its testing facility, so I’ve added a benchmark using that. The outcome…

Benchmark   Iterations  Performance
JSON        50000       52142 ns/op
XML         20000       84950 ns/op
XML Decode  100000      29083 ns/op

They are all pretty fast.

So, at least in Go, XML is fastest if you Decode, slowest if you unmarshal, and JSON is in the middle. I also make all the caveats that have to be made with quickly constructed benchmarks. I hope that it’s at least indicative of something useful.
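For anyone wanting to poke at this without a _test.go file, the testing package can also be driven from an ordinary program through testing.Benchmark. The sketch below is not the benchmark from bmark_test.go: the one-field BookInfo and the tiny documents are invented, so the numbers it prints won’t be comparable to the table above.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"testing"
)

// A deliberately tiny, hypothetical struct and pair of documents.
type BookInfo struct {
	Title string `json:"title" xml:"title,attr"`
}

var (
	jsonDoc = []byte(`{"title": "A Title"}`)
	xmlDoc  = []byte(`<book title="A Title"/>`)
)

func benchJSON(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var v BookInfo
		if err := json.Unmarshal(jsonDoc, &v); err != nil {
			b.Fatal(err)
		}
	}
}

func benchXML(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var v BookInfo
		if err := xml.Unmarshal(xmlDoc, &v); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// Each result prints as an iteration count followed by ns/op.
	fmt.Println("JSON:", testing.Benchmark(benchJSON))
	fmt.Println("XML: ", testing.Benchmark(benchXML))
}
```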

So in summary:

  • readability, we have our opinions, we can disagree here
  • conciseness: it’s not so obvious that JSON is more concise
  • speed of parsing: depends on how much help you want from your tools. XML’s tools are slower, as I’d expect given that they have to deal with a much more complex data model.

What I really wanted to get across here is that XML doesn’t have to be totally disgustingly ugly. And thanks Isaac for the excuse for a pleasant afternoon of mucking about :-)

 Posted by at 5:27 pm
Dec 17 2009
 

I’ve been experiencing grief trying to work with Ruby based tools to build a web-based server (in Part One) and a client (read on).

So. I have to write some Ruby code to act as a client to a remote server using HTTP GET/POST requests, where parameters are sent using the usual HTTP techniques. Just like a form in a browser would do. No XML. No JSON. Just good old HTTP.

I touched on one particular aspect of the server side of things in Part One.

For maybe nine months we’d been using the HTTP client built into Ruby but a few weeks ago started experiencing serious performance issues. If it’s bad enough to interfere with development then it certainly won’t work in production. So we had a look around and settled on Patron. Patron is a nice little gem wrapping libcurl. It is very fast, certainly fast enough for our purposes, has proved reliable, and is easy to use. We’ve been using it in production, well, we were until this afternoon but the timing is co-incidental.

Patron has a flaw. (Yeah, I’m going to pick on Patron, even though I seriously doubt it is alone here.)

You see, Patron takes its post data in a hash map. Of course hash maps have unique keys, so if multiple values are to be sent you have to associate an array with the key. Or so we supposed. Yeah, we supposed wrong. Patron accepts that form of data but does something seriously wrong with it. It basically converts the array to a string, a string that looks exactly like an array looks when printed in Ruby. This means, first, that multiple values are silently converted into a single value and, second, that the single value looks just like the array would look when printed in a debugging log.

Patron then sends the data to the server, which, correctly, sees a single value. This is not how even the monkey patched Rack recognises multiple values (Rack parses the key looking for indicators of multiple values, not the value itself.)

So the server gets the wrong thing, and since one is normally a perfectly valid count for a multiple valued parameter, there are no errors reported and, once again, the debugging log looks perfectly okay. But nothing works.
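The gist is Ruby, but the shape of the bug is easy to imitate in any language. Here is an analogy in Go using net/url; it is not Patron’s code, and the function names are invented. One function flattens the list into a single printed value, the other repeats the key once per value, and a third counts what a server would see.

```go
package main

import (
	"fmt"
	"net/url"
)

// flattened imitates the faulty behaviour: the whole list becomes one
// value that merely looks like a list.
func flattened(name string, values []string) string {
	form := url.Values{}
	form.Set(name, fmt.Sprint(values))
	return form.Encode()
}

// repeated sends the key once per value, as a browser form would.
func repeated(name string, values []string) string {
	form := url.Values{}
	for _, v := range values {
		form.Add(name, v)
	}
	return form.Encode()
}

// received counts how many values a server would see for name.
func received(body, name string) int {
	form, err := url.ParseQuery(body)
	if err != nil {
		return 0
	}
	return len(form[name])
}

func main() {
	colours := []string{"red", "green", "blue"}
	for _, body := range []string{flattened("several", colours), repeated("several", colours)} {
		fmt.Printf("%s -> %d value(s)\n", body, received(body, "several"))
	}
}
```

The flattened body parses without complaint and yields exactly one value, which is why nothing on either side reports an error.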

Sigh.

Anyway, like I said, Patron is a wrapper around C and after a quick perusal of the code it looks as though it flattens the value in C not Ruby, and anyway I’ve never been successful monkey patching a method that maps to the Ruby FFI.

Off I go looking for an alternative.

We ended up using typhoeus. Very nice indeed, all the good things about Patron including wrapping libcurl, but it actually generates a proper post. And our client works now.

Paul Dix wrote typhoeus. He also wrote Feedzirra a really nice feed reading library. Not bad Paul. Two excellent gems.

I’ve put some code in this gist that will demonstrate the issue. Have a look if you’re interested.

 Posted by at 8:07 pm

Ruby/Rack and Multiple Value Request Param Pain — Part One

Dec 16 2009
 

Today, in quick succession, I encountered a frustrating situation with each of two tools. It was the same problem in both cases—dealing with multiple value request parameters. There’s something you think about every day. Anyway I was tracking down a problem that has been causing us grief for weeks now, but we never spent the time to actually deal with it. What makes this even more annoying is that both these tools (Rack and Patron) are otherwise really pretty good.

Here’s the situation. The HTML select element supports multiple selections. When the browser posts to the server, the multiple selections are passed back as key-value pairs with the key repeated once for each selection. This is standard HTTP POST behaviour.

I’m going to pick on Rack in Part One. This was especially annoying since a) Rack is so solid; b) Rack is infrastructure that you expect to just work. In other words, this was a surprise.

Rack wants to store parameters in a hash map, which is a perfectly reasonable Ruby thing to do. The thing is, hash maps have unique keys. So if you simply assign the keys and values sent from the browser you’ll write over the previous value, and in the end have only one value (the last one selected). This is what Rack does unless you use naming conventions to signal your desire for an array result. So if you named your <select> as ‘several’ then you’ll never get more than one key-value in the Rack params object. Doesn’t matter how many there were, you get zero or one. If, however, you named your select as ‘several[]’ you’d get a key mapped to an array with all of the values (or so it seems, I didn’t bother trying). There is a more complex naming convention supported as well, but I’ve got little interest in that.
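The difference between overwriting and collecting fits in a few lines. The sketch below is an analogy written in Go, not Rack’s implementation, and all the function names are invented: one parser keeps a single value per key, the other appends.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseOverwriting stores one value per key, so a repeated key keeps
// only its last value.
func parseOverwriting(body string) map[string]string {
	params := map[string]string{}
	for _, pair := range strings.Split(body, "&") {
		k, v := splitPair(pair)
		params[k] = v
	}
	return params
}

// parseCollecting appends, so repeated keys accumulate.
func parseCollecting(body string) map[string][]string {
	params := map[string][]string{}
	for _, pair := range strings.Split(body, "&") {
		k, v := splitPair(pair)
		params[k] = append(params[k], v)
	}
	return params
}

// splitPair splits "key=value" and undoes the URL escaping.
func splitPair(pair string) (string, string) {
	kv := strings.SplitN(pair, "=", 2)
	if len(kv) < 2 {
		kv = append(kv, "")
	}
	return unescape(kv[0]), unescape(kv[1])
}

// unescape leaves malformed input as it found it.
func unescape(s string) string {
	if u, err := url.QueryUnescape(s); err == nil {
		return u
	}
	return s
}

func main() {
	body := "several=red&several=green&several=blue"
	fmt.Println(parseOverwriting(body)) // map[several:blue]
	fmt.Println(parseCollecting(body))  // map[several:[red green blue]]
}
```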

This is not what I expected. And it sure isn’t what I need.

This approach presumes you actually have control over the names of those select elements. As it happens in the application we’re developing we don’t have any such control. Moreover, if you are trying to implement a REST API that uses HTTP post for communication (rather than something like XML or JSON) then you’ve got the potential for similar difficulties, and there’s no way you’ll have control over the names on a published API.

So how to fix this? Well, this is where Ruby comes to the rescue, well supported by the rather clean implementation of Rack.

You monkey patch Rack. A lot of people consider monkey patching a Very Bad Thing. They are wrong. So what’s monkey patching? When a client of a library (that’d be our application, or the examples in this gist) redefines a method, or extends a Class, or something similar, the client is monkey patching. In Ruby we’ve had a lot of mileage out of Ruby’s support for this. It is used to extend the language, and to fix things. Like this problem.

The relevant lines are 15 through 22 in the gist. In Rack’s code base these lines correspond to a single line that overwrites existing key-value mappings. In the monkey patch, I’m allowing for a mapping from a key to an array of values. The rest of the code is untouched and so the other naming conventions will be as before.

The other two files are just two test cases, one without, the other with the monkey patch.

Anyway, that fixes the first issue. The second, I’ll talk about in Part Two.

 Posted by at 9:50 pm