Tag Archives: java

Brain, refactored

For a long time I was too busy to learn anything much new, while running a company and caring for two small children. But more recently, I had the time to catch up on some technical knowledge and I remembered how much I love pure learning.

This post seems to be a big list of what I’ve learned over the last couple of years. Actually, I meant to write it a year ago, so now it’s also a list of what I need to revise.

C++

I caught up with the C++ renaissance by reading Effective Modern C++. and watching heaps of CppCon, C++Now, and Meeting C++ conference videos, such as CppCon 2014: Herb Sutter’s Back to the Basics: Essentials of Modern C++ Style, (the one with almost always auto).

I exercised my growing knowledge of modern C++ by first using some modern C++ in gtkmm and Glom, and then doing a massive rewrite of libsigc++. I also explored some modern C++ generic programming techniques with tuples and dynamic programming.

Along the way I experimented, in Glom’s code, with some of the ideas from Sean Parent’s talks, such as the “no naked for loops” idea from his “C++ Seasoning” talk from Going Native 2013.

I read the old Boost Graph Library (BGL) book, to tie C++ and graph theory together in my head. I spent lots of time, off and on, trying to clean the BGL code up, and modernize it, but it’s a struggle and I’ve only managed to get some simple stuff accepted so far.

The BGL book is ancient but its ideas about C++ concepts are only just about to become mainstream, hopefully in C++20. I also read Alexander Stepanov’s  wonderful From Mathematics to Generic Programming book, which convinced me even more.

I tried to get a simple explicit operator bool check into clang-tidy itself, but that didn’t get far and I chose not to be a pain by being pushy about such an unimportant contribution.

I started to learn about some compiler theory by writing some actual code.

I listened to the complete backlog of CppCast podcast episodes and I now listen to every episode as soon as its published.

Java and Android

I revised and updated my Java skills by reading Effective Java Programming and Programming Android. I regularly listened to the Fragmented and Android Backstage podcasts, but not so much recently. I gained some real-world Android programming experience by creating the Android app for Galaxy Zoo after first trying to create an Android app for my Glom database system. The Galaxy Zoo app was quite popular, but sadly, it’s not working right now – I need to find time to adapt it to their changed  backend API.

Java and web

I spent some time with Java as a way to work on backend (distributed) systems, reading the JAX-RS book, the,Spring Microservices in action book, and the Spring in Action book.

I gained some more Java/GWT/AppEngine (and Resty-GWT) experience by creating the Big-O Quiz website. I  used bigoquiz.com to take organized notes about the computer science that I learned subsequently. It asks me questions to remind me. But see below about me rewriting bigoquiz.com with Go and Angular.

iOS / Objective-C

I learned iOS (iPhone/iPad) programming by watching all 18 lectures in Stanford’s “Developing iOS 7 Apps for iPhone and iPad” course and then creating the iOS app for Galaxy Zoo. I used Objective-C, but now I have a reason to learn Swift.

Like the Android app, this is currently broken, because the backend server now has a very different API.

Go

I learned Go via the The Go Programming Language book, taking extensive notes from the perspective of a C++ developer. To gain initial experience, I rewrote bigoquiz.com‘s backend with Go, instead of Java, and found the resulting code refreshingly simple.

I now use Go regularly at work for backend microservices and I’m quite content with it. For larger projects I’d still prefer C++, but not all projects need to be large.

Angular

I finally gave up on GWT, long after everyone else realised that it’s barely maintained. So I took the opportunity to rewrite the bigoquiz.com frontend in Angular, using Typescript. Learning it was easy with the excellent Angular tutorial.

I’ve since tried out React too.

Scala

I read the huge Programming in Scala book, because I use Scala on a large project at work. I’m fond of it, but I think its simplest ideas should just be in Java, and the cleverest stuff tends to make code hard to read.

Algorithms and Data Structures

I grew to love computer science by doing Coursera’s/Stanford’s “Algorithms: Design and Analysis” course, part 1 and part 2 by Tim Roughgarden. I then read Cracking The Coding InterviewThe Algorithm Design Manual, Algorithms, by Robert Sedgewick, and the first half of the CLRS Introduction to Algorithms book. I worked through the first few lectures of Tim Roughgarden’s “A second course in Algorithms” to cover max-flow and min-cost matching algorithms.

I went through the material 2 or 3 times, from various perspectives, eventually making sure to actually implement each data structure and algorithm.

I also read Jon Bentley’s Programming Pearls book, for some practical advice.

I watched, and re-watched, far too many of Tushar Roy’s algorithm walkthrough videos.

Linux

I already have years of Linux development experience, but I knew I was missing out on some knowledge and habits. So I solidified my Linux knowledge by reading Linux System Programming. (And now, later, I think I should read it again, though I wish there was a newer edition.)

I’m still looking for a really good book about bash scripting and general command line cleverness.

Distributed Systems

I didn’t have practical experience with at-scale server-based systems but I became acquainted with the ideas by reading Google’s Site Reliability Engineering book.

I discovered it later, but Tim Berglund’s 4-hour Distributed Systems in One Lesson series of videos  is an excellent and engaging overview of the main themes. If it’s no longer available on YouTube but I think it’s so good that you should buy it.

Mikito Takada’s free Distributed Systems For Fun and Profit book is also an excellent introduction and overview.

I also listened to many old episodes of the Software Engineering Radio podcast and the Software Engineering Daily podcast, which often cover similar topics.

Performance

Even as a regular C and C++ coder, I realized how little I thought about the computer architecture that my code is running on. Various talks by Andrei Alexandrescu and Chandler Carruth really opened my eyes to these performance considerations that seem more relevant in at-scale server code than they were in typical desktop applications or even in embedded code.

For instance, Andrei Alexandrescu’s Code Dive 2015 “Writing Fast Code: part 1” and part 2 talks, as well as his “Optimization Tips – Mo’ Hustle Mo’ Problems” talk at CppCon 2014 and his “Writing Quick Code in C++, Quickly” talk from Going Native 2013.

For instance, Chandler Carruth’s “Understanding Compiler Optimization” talk from Meeting C++ 2015, his “Tuning C++: Benchmarks, and CPUs and Compilers! Oh My!” talk from CppCon 2015, his “Efficiency with Algorithms, Performance with Data Structures” talk from CppCon 2014, and his “Care and Feeding of C++’s Dragons” talk from Going Native 2013. Also his more recent “High Performance Code 201: Hybid Data Structures” from CppCon 2016.

Mike Acton’s famous “Data-Oriented Design and C++” talk, from CppCon 2014, really gets to the point about this too, from more of a C perspective.

Coding Habits

I finally took the time to learn Vim enough to use it as my daily editor, after reading Practical Vim. I also learned to use screen and Tux but I haven’t incorporated either into my daily habits.

Ironically I never found my Linux environment outside of the terminal to be as keyboard friendly as the old pre-X MacOS. So staying in the terminal helped me to keep my hands off the mouse again. I practiced with the Klavaro typing tutor to get some of my good habits back.

I worked through the first 25 tasks from Project Euler and most of the tasks from HackerRank. I spent two intensive months working through every task on InterviewBit (around 260 at the time).

I highly recommend InterviewBit – they have a vast set of coding problems, nicely organised, with helpful test cases and tips for when you get stuck, so you can actually improve. HackerRank is useful too, but it feels less directed and the need to parse stdin input for every task gets annoying.

Along the way, I took a note of each C++ typo or generic algorithm mistake that I made, and started counting how many times I made each mistake, so I could focus on breaking my worst habits.

Management

I read some books about project and people management too, revising agile processes with the Scrum and XP from the Trenches and Kanban and Scrum: Making the most of both free ebooks.

The Manager’s Path maps out the various roles that are popular at software companies today, with suggestions about regular actions that each might perform, but it doesn’t seem useful beyond establishing that common language, and I hope it becomes outdated.

I recently read Measure What Matters, and Radical Focus, about OKRs, and I’m very enthusiastic so far. I like the idea of openness and regularity in businesses, to allow focus and alignment, almost like well-functioning decentralized open source projects that I’ve known.

I also got interested in product management, but I haven’t taken it far yet. The Cracking the Product Manager Interview book was rather superficial. I hope to dig deeper into this topic in future.

Others

I revised my Design Patterns knowledge by reading Head First Design Patterns and I used it to build a Design Patterns quiz.

I learned the Lua programming language and the Moai game-development platform via the Programming in Lua book and the “Developing Mobile Games with Moai SDK” book. This was for an idea I’d had for a kids’ programming app that didn’t seem quite worth the effort in the end.

I learned a bit about Hadoop by reading the “definitive guide” book, but I never got around to using it.

GWT: Changing from GWT-RPC to JSON with RestyGWT

GWT but with REST and JSON

My bigoquiz.com website uses my gwt-bigoquiz project, based on Java and GWT. One advantage of GWT is the serialization between the server and client. Using GWT-RPC, you can use the same Java classes on the server and the (compiled to JavaScript) browser client. (Well, at least when it works – when it doesn’t work it can be very hard to find out why). For instance, your Java service can have a getThing() method which your client code can call asynchronously, getting a Thing. It avoids duplication and avoids manually writing code to parse an intermediate format and recreate objects.

However, GWT-RPC ties you into using GWT on both the server and client. I would rather have a standard REST API that serves, and receives, JSON. I could then experiment with alternative implementations independently on the server side or the client side.

RestyGWT, together with Jersey, makes this quite easy. I have now ported gwt-bigoquiz from GWT-RPC to RestyGWT and Jersey. Here I describe how you might do that for your project.

You might also like to watch David Chandler’s talk about RestyGWT and Jersey at GWT.Create 2015 (slides, GitHub), which covers the same stuff in a little more detail, using slightly older versions.

Build

We need to add the following dependencies to our pom.xml file, so we can use RestyGWT, Jersey, Jackson and the JAX-RS annotations.

I’ve added comments to explain why we need each dependency, based on my best guesses. Please let me know about any errors you find. pom.xml files too often just contain cargo-culted blocks of XML pasted from blog entries or StackOverflow answers, far removed from whoever first wrote them. Still, for later versions of these dependencies, or different environments, you might need a different combination of versions or different workarounds.

<!-- For JAX-RS annotations, such as @GET, @POST, @Path, @PathParam, @QueryParam, etc -->
<dependency>
	<groupId>javax.ws.rs</groupId>
	<artifactId>javax.ws.rs-api</artifactId>
	<version>2.1-m09</version>
	<!-- Note: 2.0 gives us a ClassNotFoundException about RxInvokerProvider when we try to GET from the URL. -->
</dependency>

<!-- For client-side code to query a REST/JSON server API. This uses methods and classes annotated with JAX-RS annotations, doing serialization/deserialization with Jackson. -->
<dependency>
	<groupId>org.fusesource.restygwt</groupId>
	<artifactId>restygwt</artifactId>
	<version>2.2.0</version>
</dependency>

<!-- For JSON serialization/deserialization based on the JAX-RS annotations. Used by RestyGWT on the client side. Also adds some annotations such as @JsonInclude and @JsonIgnore, -->
<dependency>
	<groupId>com.fasterxml.jackson.jaxrs</groupId>
	<artifactId>jackson-jaxrs-json-provider</artifactId>
	<version>2.8.9</version>
</dependency>

<!-- To serve REST/JSON queries of REST resources.
	 A <servlet> tag in the web.xml file indicates that Jersey should use certain classes as REST resources/servelets.
	 These REST resources/servlets use Java methods and classes annotated with JAX-RS and Jackson annotations.

	 jersey-container-servlet needs the Java Servlet API version 3, supported by the AppEngine Java 8 runtime (currently beta).
	 Alternatively, jersey-container-servlet-core needs the Java Servlet API version 2, supported by the AppEngine Java 7 runtime. -->
<dependency>
	<groupId>org.glassfish.jersey.containers</groupId>
	<artifactId>jersey-container-servlet</artifactId>
	<version>2.26-b07</version>
</dependency>

<!-- To let Jersey use Jackson. -->
<dependency>
	<groupId>org.glassfish.jersey.media</groupId>
	<artifactId>jersey-media-json-jackson</artifactId>
	<version>2.26-b07</version>
</dependency>

<!-- Workaround this error: java.lang.IllegalStateException: InjectionManagerFactory not found. See https://stackoverflow.com/a/44546979/1123654 -->
<dependency>
	<groupId>org.glassfish.jersey.inject</groupId>
	<artifactId>jersey-hk2</artifactId>
	<version>2.26-b07</version>
</dependency>

Configuration

We must edit some XML files so we can use RestyGWT and Jersey.

.gwt.xml Module file

Our .gwt.xml Module definition must mention RestyGWT inside the tag, so our client code can use RestyGWT.

<inherits name="org.fusesource.restygwt.RestyGWT"/>

web.xml file

We must add <servlet> and <servlet-mapping> tags in the web.xml file to state that Jersey should use our classes to serve REST resources as servlets.

<servlet>
    <servlet-name>restServlet</servlet-name>
    <servlet-class>org.glassfish.jersey.servlet.ServletContainer</servlet-class>
    <init-param>
        <param-name>jersey.config.server.provider.packages</param-name>
        <param-value>com.murrayc.myproject.server.api</param-value>
    </init-param>
</servlet>
<servlet-mapping>
    <servlet-name>restServlet</servlet-name>
    <url-pattern>/api/*</url-pattern>
</servlet-mapping>

Server-Side Code Changes

The REST Resource

The classes for our REST resources must be in the package specified in the web.xml file., where Jersey can find them. These correspond roughly to the *ServiceImpl classes, derived from GWT’s RemoteServiceServlet, that we used with GWT-RPC.

These classes don’t need to derive from any special base class or implement any special interface. They do need to be annotated with @Path to specify the first part of the name of the resource. For instance:

@Path("thing")
public class ThingResource {
    ...
}

We then annotate methods of the class with @GET or @POST and @Produces.

Without a @Path annotation on the method, the method matches the base path of the class. For instance, this might return all available Things, when a client GETs from the thing/ URL.

@GET
@Produces("application/json")
public Collection<Thing> get() {
    ...
}

We can use additional @Path annotations on the method. For instance, this might return a Thing with a matching Thing ID, when a client GETs from the thing/123 URL, for instance.

@GET
@Path("/{id}")
@Produces("application/json")
public Thing getById(@PathParam("id") String id) {
    ...
}

Note that the Java method name does not appear in the URL.

The ServletContext

In our GWT-RPC service classes, we could get the ServletContext via getServletConfig().getServletContext(). With our REST classes, we instead use the JAX-RS @Context attribute on a member field, like so:

@Context
ServletContext context;

Jersey apparently then assigns the context at runtime.

Client-Side Code Changes

With GWT-RPC, we had a client interface like this:

public interface ThingServiceAsync {
    void getThings(AsyncCallback<List<Thing>> async);
    ...
}

which we called like so:

AsyncCallback<List<Thing>> callback = new AsyncCallback<List<Thing>>() {
    @Override
    public void onFailure(Throwable caught) {
        ...
    }

    @Override
    public void onSuccess(List<Thing> result) {
        getView().setThings(result);
    }
};

ThingServiceAsync.Util.getInstance().getThings(callback);

With RestyGWT, we’ll instead have a client interface like this:

@Path("/api/thing")
public interface ThingClient extends RestService {
    @GET
    public void getThings(MethodCallback<List<Thing>> callback);
    
    @GET
    @Path("/{id}")
    public void getThing(@PathParam("id") String id, MethodCallback<Thing> callback);
}

And we’ll use it like so:

MethodCallback<List<Thing>> callback = new MethodCallback<List<Thing>>() {
    @Override
    public void onFailure(Method method, Throwable caught) {
        ...
    }

    @Override
    public void onSuccess(Method method, List<Thing> result) {
        getView().setThingList(result);
    }
};

Defaults.setServiceRoot(GWT.getHostPageBaseURL());
ThingClient client = GWT.create(ThingClient.class);
client.getThings(callback);

So to change the GWT-RPC client code to RestyGWT client code, you only need to:

  • Create the client interface.
  • Use it instead of the GWT-RPC Async client class.
  • Change AsyncCallback to MethodCallback.
  • Add the Method parameter to onFailure() and onSuccess().

Parameters

A method of a GWT-RPC service just takes Java parameters. But for a REST service, you need to decide whether these are path parameters, or query parameters. For instance, if the URL is /things/123?color=blue then there is one path parameter (123), and one query parameter (color, with value blue). You would indicate this in your service class like so, using the JAX-RS attributes:

@GET
@Path("/{thing-id}")
@Produces("application/json")
public Thing getById(@PathParam("thing-id") String thingId, @QueryParam("color-id") String colorId) {
    ...
}

and in the client interface like so:

@Path("/api/thing")
public interface ThingClient extends RestService {
    @GET
    @Path("/{thing-id}")
    public void getThing(@PathParam("thing-id") String thingId, @QueryParam("color-id") String colorId, MethodCallback<Thing> callback);
}

Ignoring some Getters

Not all get methods should result in items in the JSON. You can avoid this with the Jackson @JsonIgnore annotation. For instance:

public class Thing {
    ...

    @JsonIgnore
    public getIrrelevantStuff() {
       ...
    }
}

Ignoring null and empty objects

To keep your JSON small, you can use Jackson’s @JsonInclude attribute to stop Jersey from writing name/value pairs with null or empty values. Of course this can obscure the JSON format by hiding its possible contents. For instance:

@JsonInclude(JsonInclude.Include.NON_EMPTY)
public class Thing {
    ...
}

You might choose to use NON_NULL rather than NON_EMPTY.

Adding Setters

If our REST GET method returns an instance of Thing, Jersey will serve a JSON string based on any public get*() methods that take no parameters, and the public fields. So, this class,

public class Thing {
    public String id;
    public getStuff() {
       ...
    }
}

would produce JSON like this, recursing into contained class instances:

{
  "id": "123",
  "stuff" {
     "foo": true,
     "bar": 789
  }
}

However, if there are no corresponding setter functions, such as setStuff(), RestyGWT (or Jackson, which it uses) will silently ignore these parts of the JSON when deserializing the JSON into a client-side Java instance. If there is a way to detect this at runtime, I’d like to know about it.

This makes it particularly important to test the serialization and deserialization. For instance:

@Test
public void ThingJsonTest() throws IOException {
    // The Jackson ObjectMapper:
    ObjectMapper objectMapper = new ObjectMapper();

    // Get the JSON for an object:
    Thing objToWrite = new Thing();
    objToWrite.setFoo(2);
    // etc.
    
    final String json = objectMapper.writeValueAsString(objToWrite);
    assertNotNull(json);
    assertFalse(json.isEmpty());

    // Get an object from the JSON: 
    Thing obj = objectMapper.readValue(json, Thing.class);
    assertNotNull(obj);
    assertEquals(2, obj.getFoo());
    // etc.
}

Removing the GWT-RPC code

Once everything is working, we can remove the unused GWT-RPC classes and configuration:

  • The ThingService interface, which extends RemoteService.
  • The ThingServiceAsync interface.
  • The ThingServiceImpl class.
  • The <servlet> and <servlet-mapping> tags from web.xml

A C++ developer looks at Go (the programming language), Part 3: Concurrency

This is the third of my posts about looking at Go from the perspective of a C++ developer. I think that will be enough for now. Here are all 3 parts:

Here I mention Go’s built-in concurrency features, letting you write code that makes the best use of one, or more, processors. I am not an expert in concurrency, and I think few people are. I’ve had some really helpful friendly feedback about the previous parts from Go developers, so I hope that continues. I will gladly correct any mistakes that are pointed out to me so the text can become more useful.

However, the mentions of C++ coroutines here might also be particularly useless. Clang and gcc don’t support coroutines in their stable releases yet and it’s hard to find definitive explanations. What I can find is contradictory and far from as  simple as Go, probably because the C++ coroutines specification has been in flux for a while. I would very much welcome corrections. Interest in C++ coroutines seems to be heating up now so I guess this will settle down soonish. I think it will then be time for some books to be updated. I hope that the end result is at least as expressive as Go but I am not confident of that so far.

As before, I strongly suggest that you read “The Go Programming Language”, (referred to her as “the book”), from which I’ve learned this stuff. At least at first, the structure of this text is aimed at people familiar with C++ who are interested in Go, but the comparison does break down later.

Goroutines and std::task

Like C++ (since C++11), Go has support for simple concurrency features such as lockable mutexes (which we’ll see later), but it doesn’t let you create raw threads. Instead if offers goroutines.

Goroutines are, at their simplest, functions that can run concurrently with other functions. Two goroutines might even run on different CPUs, meaning that they run truly simultaneously. You can start one just by using the “go” keyword before a normal function call. The function will then execute, without the caller waiting for it to complete.

go doSomethingThatTakesALongTime()
doSomethingElseAtTheSameTime.

While the goroutine does its work, maybe using traditional cooperative multi-threading on the same CPU, maybe on another CPU, the implicit main goroutine can also do work. Update: But careful. As mentioned later, this program won’t wait for the first goroutine to finish if the main goroutine finishes first.

So far, at least, this is a bit like starting a std::task in C++ with std::async() (since C++11):

auto future = std::async(doSomethingThatTakesALongTime)
doSomethingElseAtTheSameTime()
future.wait();

(Update: As James pointed out in a comment, if you don’t keep the returned future, its destructor will run, which will cause a wait, making the call not async.)

Hybrid Threading (m:n Threading)

A std::task in C++ is not guaranteed to start an actual new OS (kernel) thread, unless you pass std::launch::async as an extra argument. It will sometimes, but that’s up to the runtime and the implementation. I think the runtime, in some implementations, might even be able to move tasks to real threads and between threads, but I’m not sure. Scott Meyer’s “Effective Modern C++” has some good discussion of tasks and C++ concurrency in general.

Goroutines are not ambiguous about this. Goroutines don’t map directly to kernel (OS) threads. They use m:n threading, in which user space (the Go runtime, in this case) schedules its own m “threads”, mapping them onto a pool of n kernel threads. This seems to be what is referred to as “Hybrid Threading” in Robert Love’s “Linux System Programming” book – a mixture of 1:1 Threading (just using kernel threads) and N:1 Threading (or User-Level Threading, apparently sometimes called “green threads” or “lightweight threads”.). Presumably there is some very clever scheduling code in Go’s runtime and in the C++ coroutines implementation.

Because Goroutines don’t map directly to real threads, they don’t need the full stack size given to a kernel thread, letting us comfortably use far more goroutines than we would use threads. However, you only get the full benefit of this when the work is happening purely in the Go runtime. The book explains that “Goroutines that are blocked in I/O or other system calls or are calling non-Go functions, do need an OS-thread”. So I guess you still wouldn’t want, for instance, a goroutine per network connection, if you are serving many network clients.

Channels: Getting One Result

In C++, std::async() returns a std::future. This is a way to later wait for the result from the task, or just get the result if it is already ready, having made good use of the original thread’s time in the meantime.

int doSomethingThatTakesALongTime() {
  ...
  return 5;
}
 
auto f = std::async(doSomethingThatTakesALongTime)
doSomethingElseLengthyAtTheSameTime()
auto answer = f.get()

The go keyword doesn’t return a future. (Update: And doesn’t let you use the return value.) You could use the function’s result, but you’d just immediately start waiting for it, defeating the purpose (I wonder why Go allows this at all.)

Instead, to provide a result, a Go program would use a Go channel. Go channels are queues of your chosen elements. Channels are a built-in type, so they are generic – they can contain various types. They can be created like so, using the make() built-in function as for other built-in types:

c := make(chan string)

The Go language has special syntax for sending to, or receiving from, channels via the <- and -> operators, though I don’t see what would have been so bad about having send() and receive() methods.

Go channels are “concurrency safe”, so you can pass one to a goroutine, and both goroutines can happily manipulate it. For instance, to get a response to a long-running function, but only after we’ve tried to make good use of the waiting time:

func doSomethingThatTakesALongTime(c chan int) int {
  ...
  c <- 5
}
 
c := make(chan int)
go doSomethingThatTakesALongTime(c)
doOtherStuffInTheMeantime()
answer := <- c

That attempt to receive from the channel (with the <- operator) will block until the channel has something to provide, a bit like how std::future::get() blocks until the result is ready.

It would be fairly easy to create a future type that uses a channel, then you could write an async method that uses the go keyword and returns the future. But I think it would be nice if Go made this simple case simpler by doing this for us. Maybe it does.

I’m also a little uncomfortable with the calling goroutine (the main goroutine here) having to create the channel, repeating the type of that channel (it’s also in the called goroutines function signature). I think it would be nice if there was some syntax that gave us a new instance of the channel (Ready for receiving. See about unidirectional channels later.) that the called goroutine expected to get, while also allowing us to use an existing channel if necessary. That would be hard to get right, of course.

If the channel is empty and has been closed then it will then provide a zero value immediately. Or you can get the extra result to check if the channel is still open or still has values:

answer, ok := <- c
if (!ok) {
  // The channel is empty and has been closed.
}

Goroutines and C++ Coroutines

Goroutines feel a little like C++ coroutines (not yet in C++ as of C++17). I don’t know if their scheduling and stack management is quite as clever. I suspect not, because I keep reading stuff about C++ coroutines  being “stackless”, which I think means that they map directly to kernel threads, while Goroutines are apparently “stackful”.

Goroutines allow cooperative scheduling, via blocking channel sends and receives (see select) and sleeps. So do C++ coroutines, via co_yield, co_return, co_await, futures and generators.

So I’ll mention C++ coroutines later as I show other features of Goroutines and channels. I’d love to know the actual differences, if someone can explain it to me clearly enough that I understand. I really hope the end result is as easy to use as Goroutines.

C++ Coroutines: Getting One Result

For now, let’s look at how a C++ coroutine would let us return a single value, much like we did with std::async() (with a std::future) and with Go’s go keyword (with a channel). We would use a std::future return type for the function’s implementation, use co_await to launch the function, and co_return to return the result into the std::future.

This is based on various other blog posts and slides that I’ve seen. I don’t know if this is really correct because I don’t have a compiler that actually supports this yet.

std::future<int> doSomethingThatTakesALongTime() {
  ...
  co_return 5;
}

int main() {
  int answer = co_await doSomethingThatTakesALongTime();
  doOtherStuffInTheMeantime()
  
  // Attempting to get the value blocks until the result is ready.
  // (I'm not sure about this part. Do we need to explicitly wait
  // for the future to have a value?)
  std::cout << "answer: " << answer << '\n';
  return 0;
}

Here the co_await keyword is a little like the go keyword in Go.

Channels: Multiple Results

But channels are a queue of values, not just single values, so you could receive a series of results. Go’s range-based for loop has built-in support for this, as mentioned in section 8.4.2 (Pipelines) in the book. For instance:

func sendMessagesToOtherSolarSystemsAndGetResponses(c chan string) {
  for i := 0; i < 10; i++ {
     ...
     c <- response
  }

  c.Close()
}
 
c := make(chan string)
go sendMessagesToOtherSolarSystemsAndGetResponses(c)
doOtherStuffInTheMeantime()
for response := range c {
  fmt.Println(response)
  ...
}

fmt.Println("Not waiting any longer for more responses.")

Notice that, unlike a regular range-based for loop in Go, this gives you the value, not the index (which would be meaningless for a channel). In a regular range-based for loop, you need to get the index and value (maybe using _ for the index) to get the value.

The range-based for loop blocks at each operation, waiting for a new value from the channel, and it will stop automatically when the channel is empty and closed.

As mentioned in section 8.5 (Looping in Parallel), if we return from the main goroutine early, for instance in response to an error, we must make sure to “drain” the channel if we don’t want the other goroutines to be blocked when they try to send, even when using a buffered channel (see below). You could launch a separate goroutine just to do this.

C++ Coroutines: Multiple Results

With C++ coroutines, to return multiple results, we can use co_yield instead of co_return, using a generator instead of a std::future.

Again, this is based on various other blog posts and slides that I’ve seen. I don’t know if this is really correct because I don’t have a compiler that actually supports this yet. In particular, I don’t know if there will really be a std::generator in the standard library.

std::generator<std::string> sendMessagesToOtherSolarSystemsAndGetResponses() {
  while (still_waiting) {
    ...
    // "return" one result via the generator.
    co_yield response;
  }

  // TODO: Do we need a co_return somewhere too?
}

int main() {
  auto messages = co_await sendMessagesToOtherSolarSystemsAndGetResponses();

  // The range-based for loop waits for the next message each time.
  // TODO: When does it stop?
  for (auto response : messages) {
    std::cout << response << '\n';
  }

  std::cout << "Not waiting any longer for more responses.\n";
  return 0;
}

So far that makes a generator look like a bit like a Go channel. But Go channels seem to be more versatile, as we will see.

Unidirectional channels

By default channels allow goroutines to both send or receive messages. This is rarely what you want inside a particular goroutine, so you can declare a channel type that is only for sending or only for receiving. You would typically create a normal channel and then the goroutine’s function would declare that it takes a unidirectional channel. A send-only channel is of type chan<-, and a receive only signal is of type <-chan. I do think that syntax is hard to love.

For instance, the function from our previous example really only needs a channel that it can send messages on, not receive them too, so we should change the type of the channel:

func sendMessagesToOtherSolarSystemsAndGetResponses(c chan<- string) {
  for i := 0; i < 10; i++ {
    ...
    c <- response
  }

  c.Close()
}

The compiler can then help by preventing a message from being sent on the channel. When we pass the regular channel to the goroutine it will cast implicitly to the unidirectional channel. But we may not cast the unidirectional channel back to a regular channel.

I don’t know if there is some equivalent for this in the world of C++ coroutines and generators. Maybe it is generally unwise to compare C++ generators and Go channels.

Buffered Channels

So far we have created unbuffered channels. When we sent a message on the channel, the sending goroutine blocked until the message was received by the receiving goroutine. And when we tried to receive a message the receiving goroutine blocked until the sending goroutine had put a message in the channel. Therefore, unbuffered channels are also called synchronized channels.

But in our example, it would be better if the sending goroutine could keep adding messages to the channel without waiting for them to be received. We can use a buffered channel instead, specifying the size of the buffer:

c := make(chan string, 100) go sendMessagesToOtherSolarSystemsAndGetResponses(c)

The sending goroutine will then only block when the channel is full, and the receiving goroutine will only block when the channel is empty.

Using an Unbuffered Channel to Signal “done”

Unbuffered channels can be useful to signal that a goroutine has finished. This is mentioned in section 8.4.1 (Unbuffered Channels) of the book.

For instance, you will often want the main goroutine to wait for other goroutines to finish instead of stopping the whole program. By using an extra “done” channel, the second goroutine can signal to the main goroutine that is has finished, by sending an item on the “done” channel, and the main goroutine can block on that by attempting to receive from the “done” channel. For instance, here the goroutine does not return any responses so we need another way to signal that we have finished:

func sendMessagesToOtherSolarSystems(done chan struct{}) {
  for i := 0; i < 10; i++ {
    ...
  }

  done -< struct{}{}
}

done := make(chan struct{})
go sendMessagesToOtherSolarSystems(done)
doOtherStuffInTheMeantime()
<- done // This blocks

However, when the goroutine is already returning results via a channel, I believe it is simper to just wait for the channel to be closed, as we did in the previous example by using a range for over the channel.

The book recommends use of struct{} as the channel element, probably because it has no value. A bool seems more more concise, but you’d need a comment saying that the value (true or false) was meaningless. I guess you could also just do a range for over the channel, waiting for it to close, like when we used a channel to return responses, but without ever sending anything.

I feel that this “done” synchronization technique could have a more explicit syntax. I think that would make the code clearer and would mean that various code that does the same thing would have less uninteresting differences. Sync.WaitGroup (see below) seems to be what I want, but this raw “done” channel technique seems more popular for simple cases.

If we were using C++ coroutines, I’d guess we would do the same thing by waiting for the result from a std::future.

Waiting for Multiple Goroutines to Signal “done”

A variation of the “done” channel technique is useful when waiting for multiple goroutines to finish. If we know how many goroutines we have started, we can use a channel to wait for that many “done” messages. This is mentioned in section 8.5 (Looping in Parallel) of the book, though it doesn’t call it a “done” channel there. For instance:

func sendMessageToSolarSystem(coordinates Coordinates, done chan struct{}) {
  ...

  done -< struct{}{}
}

done := make(chan struct{}
for coord range : solarSystems {
  go sendMessageToSolarSystem(coord, done)
}
doOtherStuffInTheMeantime()

// Wait for as many done messages as goroutines that we started.
// We could have counted them instead, and then counted down here.
for range : solarSystems {
  <- done
}

As section 8.5 (Looping in Parallel) of the book says, if you wanted to return something useful from each goroutine, even just an error, you would use that instead of a struct{}.

Again, I don’t know how we would do this with C++ coroutines.

sync.WaitGroup

As an alternative to the “done” channel technique when waiting for multiple goroutines to finish, Go provides the sync.WaitGroup concurrency-safe  counter. It has Add() and Done() methods and a Wait() method that blocks until the count falls back to zero. This is mentioned at the end of section 8.5 (Looping in Parallel) in the book.

func sendMessageToSolarSystem(coordinates Coordinates, wg sync.WaitGroup) {
  ...

  wg.Done() // Doing it via defer would be safer.
}

var wg sync.WaitGroup
for coord range : solarSystems {
  wg.Add(1)
  go sendMessageToSolarSystem(coord, done)
}
doOtherStuffInTheMeantime()

wg.Wait()

Notice that there is no Add() method overload that uses a default of 1, because Go has no function overloads and no default function parameter values.

I don’t know how we’d do the same with C++ coroutines. Even clever and correct use of a std::condition_variable would not be helpful if C++ coroutines do not map directly to real kernel threads (see above).

Using a Channel as a Counting Semaphore

Section 8.8 (Example: Concurrent Directory Traversal) of the book suggests using a buffered channel of a specific size to allocate and release tokens, preventing too many goroutines from doing work at the same time, causing them to block until a token is available. In the example, this limits the use of a constrained resource, preventing the goroutines from opening too many files. For instance:

func sendMessageToSolarSystem(coordinates Coordinates, sema chan struct{}) {
  sema <- struct{}{} // Acquire token.
  defer func() { <- sema }() // Release it later.
  ...
}

// Don't try to send more than 10 messages at a time.
sema := make(chan struct{}, 10)
for coord range : solarSystems {
  go sendMessageToSolarSystem(coord, done)
}
doOtherStuffInTheMeantime()

// This ignores the need to wait for completion.

I think I would prefer some syntax or API that let me explicitly specify a pool that allows only a specific number of coroutines to be active (or maybe even instantiated). Obviously that would only be useful in simple cases and it’s generally unpleasant to hard-code numbers anyway.

Select: Multiple results, from multiple channels, without blocking

Go has a select/case construct built in to the language just for channels. This is mentioned in section 8.7 (Multiplexing with Select) of the book.

It’s a bit like switch/case. It’s also a bit like the POSIX select() function in C, used to respond to activity on file descriptors, such as network connections. It can wait for activity on one of the specified channels. Or it can repeatedly poll if you supply a default case and put it in a loop.

c1 := make(chan string)
go sendMessagesToExoplanetsViaRadio(c1)
c2 := make(chan string)
go sendMessagesToExoplanetsByBlockingTheSun(c2)
 
for {
  select {
    case response := <- c1:
      fmt.Println("Response to or radio message: %s", response)

    case response := <- c2:
      fmt.Println("Response to or sun message: %s", response)

    default:
      if (too_late) {
        break
      }
      doOtherStuffInTheMeantime()
      // or sleep.
  }
}

fmt.Println("Not waiting any longer for more responses.")

I wonder how C++ could support a similar construct with coroutines and generators.

Canceling a Goroutine by Closing a Channel and Polling

The main goroutine might no longer need the result from a goroutine that it has started. Maybe another goroutine found the answer first, or maybe it was in response to a user request that has been canceled, or a connection that has been disconnected.

(Update: See the comment below, about Contexts, added in Go 1.7.)

Section 8.9 (Cancellation) of the book suggests signaling this by closing a channel, and suggests that the called goroutines should poll for this channel closure via select with a default case (see above). The book calls this channel “done” but I think that confuses this technique with technique that sends a done message over the done channel.

For instance:


// Poll to see if the channel has been closed.
func canceled(canceler chan<- string) bool {
  select {
    case <-canceler:
      return true
    default:
      return false
  }
}
  
func sendMessagesToExoplanetsViaRadio(c chan<- string, canceler chan struct{}) {
   ...

   if canceled(canceller) {
      // Obviously not neat code:
     c <- "Not sent"
   }

   ...
   c <- response
}
  
c1 := make(chan string)
canceler1 := make(chan struct{})
go sendMessagesToExoplanetsViaRadio(c1, canceler1)
c2 := make(chan string)
canceler2 := make(chan struct{})
go sendMessagesToExoplanetsByBlockingTheSun(c2, canceler2)
 
for {
  select {
    case response := <- c1:
      // Cancel the other attempt because we don't need it.
      canceler2.Close();

    case response := <- c2:
      // Cancel the other attempt because we don't need it.
      canceler1.Close();
  }
}

Again, I would like more explicit syntax or API for goroutine cancellation. For me, being able to implement things with channels doesn’t mean that they are the best API for expressing those things.

Communicating Instead of Sharing

Go programs tend to send messages via channels to change (indirectly) shared data and to let goroutines see the latest state of that data. “Do not communicate by sharing memory; instead, share memory by communicating.” is a popular phrase in the Go world.

This allows the use of data to be restricted to a single goroutine, often called a “monitor goroutine”. The locking (or just general lack of races if the Go channel is implemented with lock-free code) is then restricted to the channels, implicitly, with no explicit locking of the data being necessary. There are some examples of this in section 9.1 (Race Conditions) in the book.

It occurs to me that, unlike locking via a mutex, this does not guarantee that all goroutines see the latest state of any shared data. But it does guarantee an order, which is maybe all that usually matters.

Because channels are concurrency-safe, you can even send them as elements of a channel. Section 8.10 (Example: Chat Server) does this so it can use “client” channels as identifications of client goroutines (as well as sending strings over the client channels), keeping track of them via entering and leaving channels and a map of the client channels.

Concurrency-safe containers

We’ve already seen that Go channels are “concurrency-safe”, so you can share them between goroutines without additional locking. So are slices and maps.

Update: No, slices and maps are not concurrency safe. I must have misunderstood something that I read somewhere. This makes the rest of this sub-section irrelevant, though hopefully correct.

Interestingly, the Java standard library moved away from making container types thread safe, because (as I understand it) the developer would generally use them as part of a larger custom data structure that would need its own locking (synchronization) anyway. Java now provides some separate containers explicitly for use with concurrency.

Likewise, in the C++ standard library, types only tend to have locking (for instance std::shared_ptr’s “partial internal synchronization“) when it wouldn’t be possible for the application code to do the locking itself (external synchronization). Presumably, Go programs then tend to have more locking that necessary, and presumably this is Go again choosing convenience and safety over raw performance.

However, I would really like to see a concurrency-safe queue in C++, similar to a Go channel. A lock-free concurrent queue seems like exactly the kind of difficult-to-implement, but easy to use, and broadly useful, thing that belongs in the standard library.

No thread-local storage

Because Goroutines don’t map directly to real threads, Go doesn’t try to provide “thread-local storage”. C++ has this, via its thread_local keyword, but I believe this is not useful, or wise, with C++ coroutines either.

Mutexes

Goroutines are a suitable way to co-operatively schedule work, and communication over channels can avoid the need for explicit locking, but you might still need to explicitly prevent concurrent access of shared data. Mutexes do this in both Go and C++, but feel free to skip to the summary if you only cared about Goroutines and channels.

Go’s sync.Mutex is like std::mutex

We lock/unlock Go’s Mutex, like so, though you can also do it manually instead of using defer.

m sync.Mutex

func something() {
  m.Lock()
  defer m.Unlock()

  // do something to the data.
}

which corresponds to this in C++ with std::mutex:

std::mutex m;

void something() {
  std::lock_guard<std::mutex> lock(our_mutex);

  // do something to the data.h
}

I’ve always wished there was some shorter syntax for that lock_guard in C++.

Note that you can use std::unique_lock, instead of std::lock_guard, if yo need to manually lock and unlock.

Go’s sync.RWMutex is like std::shared_mutex with std::shared_lock

A Read/Write mutex (or “shared exclusive lock”) allows multiple threads to read from shared data, while ensuring that they each see the latest state of that shared data, but only allows one thread at a time to write to the shared data.

We lock/unlock Go’s Mutex, like so, though you can also do it manually instead of using defer.

m sync.RWMutex

func somethingThatReads() {
  m.RLock()
  defer m.RUnlock()

  // get something from the data.
}

func somethingThatWrites() {
  m.Lock()
  defer m.Unlock()

  // do something to the data
}

which corresponds to this in C++ with std::shared_mutex:

std::shared_mutex m;

void somethingThatReads() {
  std::shared_lock<std::mutex> lock(our_mutex);

  // get something from the data
}

void somethingThatWrites() {
  std::lock_guard<std::mutex> lock(our_mutex);

  // do something to the data
}

I do find the Go names more obvious here. They seem to be aimed more at how people will actually use these locks in most situations.

No recursive/re-entrant mutex

Go mutexes, like C++’s s std::mutex and std::shared_mutex, are not re-entrant, so if a thread tries to lock a mutex that it has already locked, there will be a deadlock. C++ does provide a re-entrant mutex via std::recursive_mutex, but actually using it is generally considered rather disgusting and a sign that the code should be restructured to avoid it.

As the Go book wisely says. “The purpose of a mutex is to ensure that certain invariants of the shared variables are maintained at critical points during program execution. One of the invariants is “no goroutine is accessing the shared variables,” but there may be additional invariants specific to the data structures that the mutex guards. When a goroutine acquires a mutex lock, it may assume that the invariants hold. While it holds the lock, it may update the shared variables so that the invariants are temporarily violated. However, when it releases the lock, it must guarantee that order has been restored and the invariants hold once again. Although a re-entrant mutex would ensure that no other goroutines are accessing the shared variables, it cannot protect the additional invariants of those variables.”

Summary

Go’s concurrency features are another example of its choosing safety and simplicity over potential raw performance, in an area where standard C++ does not yet offer such simplicity even as an option.

Both support standard mutex locking, in much the same way. But otherwise their strengths go in opposite directions:

C++ lets you write lock-free concurrent code so the compiler, with the help of some hardware features, can arrange for the compiled code to never attempt simultaneous changing of your shared data, while ensuring that different threads see changes when necessary. But that is notoriously difficult to get right. Nevertheless it needs to be available so we can benefit from the work of people who know how to use it.

On the other hand, Go lets you write concurrency code that is easier to get right than when just using standard locks There is probably some small performance cost, but it’s likely to actually work and you’ll probably understand how it works. After all, when concurrency code is wrong (C++ makes it easier to be wrong), performance can suffer as well as safety.

I do have some hope that things will get better for C++ when coroutines arrive and are widely understood, but it looks like they alone might not be enough to match Go’s goroutines ad channels.

Kids Coding: Simple Java

Moving Beyond Blocks

I recently got my son (9) started on “real” programming with text files and the command line. Here are some notes about that. It’s still early in this experiment so I can’t claim that it’s a great idea, but its kind of working so far. There are so many web sites and product to get kids started in programming, but very little to move them past the basics into something more expansive. They risk getting obsessed but not then staying obsessed.

I learned programming because that is what you did with computers when I was a child. I kept learning programming because I could make the computer do new and interesting things. The problem now is that computers do amazing things without any programming and it requires massive experience and effort to achieve equally impressive results. So we must create a simpler environment which lets the child feel good about creating and understanding something more straightforward. It should be a path to real programming, not another island in a chain.

My son still hadn’t done much real text-based programming other than some playing with processing.org inside the IDE (which I also recommend) and some Python in CodeCombat. He had previously played a bit with Scratch and Lego Wedo, Hour of Code, Open Roberta (really good if you have the Lego Mindstorms education kit) and Tynker. He’s also done some Lego Mindstorms, but I think that’s an awful programming experience, at least with the non-education kit.

I’ve chosen Java, reluctantly, because I am familiar with it, it’s forgiving and convenient enough, and its language constructs (for loops, functions, etc) are enough like other C-like languages that the experience should be generally transferable. It’s far from ideal, but processing.org helps a little.

I love C++, but it’s clearly not suitable as a first text-based programming language. Neither is C, though I can imagine us trying that sometime. I think Python and Go are worth considering, but Python’s loose type system frustrates me personally and Go doesn’t seem quite mainstream enough.

However, if there is a simple language that you enjoy, whatever it is, I suggest that you use it with your kids. You must have enthusiasm to share and you must help them move past frustrations. If solving problems in the language doesn’t feel interesting to you then it probably won’t seem interesting to them. The overall aim here is to share your own love of programming by rediscovering how you learned programming yourself.

Using the Command Line

First I introduced the Linux command line, presenting it as where the real power is, where you can see and control what’s happening under everything else. (Yes, I know. Nevermind.) I kept is simple, just talking about directories and editing text files. We only covered:

  • Opening the terminal.
  • Seeing where you are with pwd.
  • Seeing what is there with ls.
  • Moving into a directory with cd.
  • Moving back with “cd ..”.
  • Opening a text file with gedit, from the terminal.
  • Using & so we can continue using the terminal after opening the file in gedit.
  • Running our program from the terminal.

Hiding the Boiler Plate

I wanted to mimic the experience of programming in simpler times, or of simple scripting today. At least at first, I wanted us to just write commands into a text file and just run the resulting program, without worrying about main functions or having a parent Java class, without  separate compilation and execution steps, and just using a Println() function instead of System.out.Println(). This let us avoid discussion of functions or classes until later.

I put this in my .bash_aliases file:

processing-run() {
    $HOME/processing/processing-java --sketch="$1" --run
}

so we can just run a program like this:

$ processing-run add

which runs the program in add/add.pde.

The .pde file just contains Java code, but not in any function and not part of any class (processing.org’s “static mode”). However, as soon as you want to implement your own function, you’ll have to put the rest of your code in a void setup() method (processing.org’s “active mode”), still not part of any class.

Using processing.org for this is a bit awkward, so I’m open to more suitable Java-like environments.

Doing Something Useful But Simple

I wanted to gradually explore simple programming techniques such as variables and loops, so I needed something that would just manipulate inputs and provide useful outputs.

I’ve chosen to gradually develop little programs based on the arithmetic techniques kids often learn in school. For instance, they learn an algorithm to add large numbers together using a carry – they just don’t call it an algorithm. They also learn long multiplication and long division, which add layers of complexity. As a bonus, this can demystify the magic that school requires them to repeat.

For instance, this commit history shows how we gradually wrote and improved a little program to add numbers (as strings). It’s missing the really basic code we started with. I used it to introducing new programming ideas along the way, such as:

  • Using variables.
  • Writing strings with quote marks.
  • Looping over the characters in a string (and backwards), particularly with the awkward for loop syntax.
  • Text output (standard out)
  • Discovering the length of a string instead of hard-coding it, to make the code more general.
  • ASCII Character codes. (Ignoring full Unicode for now.)
  • Getting a number (0, 1, 2, …) corresponding to a character (‘0’, ‘1’, ‘2’, …). Talking about types.
  • Using variables and manipulating them. For instance, adding the two numbers for two digits from two “numbers in strings”.
  • Variables in the loop and variables outside the loop. Some variables’ values are forgotten at the end of each time the loop is run, and some values are kept and used the next time.
  • Conditionals with if.
  • Putting code into functions and calling it.
  • Making functions take parameters, to make them more generally useful.
  • Calling functions from functions.
  • Passing an array instead of several parameters of the same type, to make the function more general.

You soon end up with a bunch of code that is big enough to seem overwhelming to a child. So every now and then I printed the code out so we could discuss it together. We talked about ideas such as variables, and setting variables, about functions, function parameters, and how to call functions, marking places in the code code with colored pencils.

Version Control

Programming is an iterative, experimental, activity. So I think version control should be introduced early as long as you can keep it simple.

We wrote the code at the same time on our own computers while sat next to each other. I checked in the changes to GitHub because they don’t allow users younger than 13. Seeing the changes, and their descriptions, should help kids remember the learning experience and remind them where they are on the path they’ve takhen. But it’s a real shame that kids can’t really own their coding history on GitHub, even privately.

A C++ developer looks at Go (the programming language), Part 2: Modularity and Object Orientation

This is the second part of a series. Here are all 3 parts:

In part 1, I looked at the simpler features of Go, such as its general syntax, its basic type system, and its for loop. Here I mention its support for packages and object orientation. As before, I strongly suggest that you read the book to learn about this stuff properly. Again, I welcome any friendly corrections and clarifications.

Overall, I find Go’s syntax for object orientation a bit messy, inconsistent and frequently too implicit, and for most uses I prefer the obviousness of C++’s inheritance hierarchy. But after writing the explanations down, I think I understand it.

I’m purposefully trying not to mention the build system, distribution, or configuration, for now.

Packages

Go code is organized in packages, which are much like Java package names, and a bit like C++ namespaces. The package name is declared at the start of each file:

package foo

And files in other packages should import them like so:

package bar

import (
  "foo"
  "moo"
)

func somefunc() {
  foo.Yadda()
  var a moo.Thing
  ...
}

The package name should match the file’s directory name. This is how the import statements find the packages’ files. You can have multiple files in the same directory that are all part of the same package.

Your “main” package, with your main function, is an exception to this rule. Its directory doesn’t need to be named “main” because you won’t be importing the main package from anywhere.

Go doesn’t seem to allow nested packages, unlike C++ (foo::thing::Yadda) and Java (foo.thing.Yadda), though people seem to work around this by creating separate libraries, which seems awkward.

I don’t know how people should specify the API of a Go package without providing the implementation. C and C++ have header files for this purpose, separating declaration and implementation.

Structs

You can declare structs in go much as in C. For instance:

type Thing struct {
  // Member fields.
  // Notice the lack of the var keyword.
  a int
  B int // See below about symbol visibility
}

var foo Thing
foo.B = 3

var bar Thing = Thing{3}

var goo *Thing = new(Thing)
goo.B = 5

As usual, I have used the var form to demonstrate the actual type, but you would probably want to use the short := form.

Notice that we can create it as a value or as a pointer (using the built-in new() function), though in Go, unlike C or C++, this does not determine whether its actual memory will be on the stack or the heap. The compiler decides, generally based on whether the memory needs to outlive the function call.

Previously we’ve seen the built-in make() function used to instantiate slices and maps (and we’ll see it in part 3 with channels). make() is only for those built-in types. For our own types, we can use the new() function. I find the distinction a bit messy, but I generally dislike the whole distinction between built-in types and types that can be implemented using the language itself. I like how the C++ standard library is implemented in C++, with very little special support from the language itself when something is added to the library.

Go types often have “constructor” functions (not methods) which you should call to properly instantiate the type, but I don’t think there is any way to enforce correct initialization like a default constructor in C++ or Java. For instance:


type Thing struct {
  a int
  name string
  ...
}

func NewThing() *Thing {
  // 100 is a suitable default value for a in this type:
  f := Thing{100, nil}
  return &f
}

// Notice that different "constructors" must have different names,
// because go doesn't have function or method overloading.
func NewThingWithName(name string) *Thing {
  f := Thing{100, name}
  return &f
}

Embedding Structs

You can anonymously “embed” one struct within an other, like so:

type Person struct {
   Name string
}

type Employee struct {
  Person
  Position string
}

var a Employee
a.Name = "bob"
a.Position = "builder"

This feels a bit like inheritance in C++ and Java, but this is just containment. It doesn’t give us any real “is a” meaning and doesn’t give us real polymorphism. For instance, you can do this:

var e = new(Employee)

// Compilation error.
var p *Person = e

// This works instead.
// So if we thought of this as a cast (we probably shouldn't),
// this would mean that we have to explicitly cast to the base class.
var p *Person = e.Person

// This works.
e.methodOnPerson()

// And this works.
// Name is a field in the contained Person struct.
e.Name = 2

// These work too, but the extra qualification is unnecessary.
e.Person.methodOnPerson()

Interfaces, which we’ll see later, do give us some sense of an “is a” meaning.

Methods

Unlike C, but like C++ and Java classes, structs in Go can have methods – functions associated with the struct. But the syntax is a little different than in C++ or Java. Methods are declared outside of the struct declaration, and the association is made by specifying a “receiver” before the function name. For instance, this declares (and implements) a DoSomething method for the Thing struct:

func (t Thing) DoSomething() {
  ...
}

Notice that you have to specify a name for the receiver – there is no built-in “self” or “this” instance name. This feels like an unnecessary invitation to inconsistency.

You can use a pointer type instead, and you’ll have to if you want to change anything about the struct instance:

func (t *Thing) ChangeSomething() {
  t.a = 4
}

Because you should also want to keep your code consistent, you’d therefore want to use a pointer type for all method receivers. So I don’t know why the language lets it ever be a struct value type.

Unlike C++ or Java, this lets you check the instance for nil (Go’s null or nullptr), making it acceptable to call your method on a null instance. This reminds me of how Objective-C happily lets you call a method (“send a message to” in Objective-C terminology) on a nil instance, with no crash, even returning a nil or zero return value. I find that undisciplined in Objective-C, and it bothers me that Go allows this sometimes, but not consistently.

Unlike C++ or Java, you can even associate methods with non struct (non class) types. For instance:

type Meters int
type Feet int

func (Meters) convertToFeet() (Feet) {
  ...
}

Meters m = 10
f := p.convertToFeet()

No equality or comparison operator overloading

In C++, you can overload operator =, !=, <, >, etc, so you can use instances of your type with the regular operators, making your code look tidy:

MyType a = getSomething();
MyType b = getSomethingElse();
if (a == b) {
  ...
}

You can’t do that in Go (or Java, though it has the awkward Comparable interface and equals() method). Only some built-in types are comparable in Go – the numeric types, string, pointers, or channels, or structs or arrays made up of these types. This is an issue when dealing with interfaces, which we’ll see later.

Symbol Visibility: Uppercase or lowercase first letter

Symbols (types, functions, variables) that start with an uppercase letter are available from outside the package. Struct methods and member variables that start with an uppercase letter are available from outside the struct. Otherwise they are private to the package or struct.

For instance:

type Thing int // This type will be available outside of the package.
var Thingleton Thing// This variable will be available outside of the package.

type thing int // Not available outside of the package.
var thing1 thing // Not available outside of the package.
var thing2 Thing // Not available outside of the package.

// Available outside of the package.
func DoThing() {
  ...
}

// Not available outside of the package.
func doThing() {
  ...
}

type Stuff struct {
  Thing1 Thing // Available outside of the package.
  thing2 Thing // "private" to the struct.
}

// Available outside of the struct.
func (s Stuff) Foo() {
  ...
}

// Not available outside of the struct.
func (s Stuff) bar() {
  ...
}

// Not available outside of the package.
type localstuff struct {
...
}

I find this a bit strange. I prefer the explicit public and private keywords in C++ and Java.

Interfaces

Interfaces have methods

If two Go types satisfy an interface then they both have the methods of that interface. This is similar to Java interfaces. A Go interface is also a bit like a completely abstract class in C++ (having only pure virtual methods), but it’s also a lot like a C++ concept (not yet in C++, as of C++17). For instance:

type Shape interface {
  // The interface's methods.
  // Note the lack of the func keyword.
  SetPosition(x int, y int)
  GetPosition() (x int, y int)
  DrawOnSurface(s Surface)
}

type Rectangle struct {
  ...
}

// Methods to satisfy the Shape interface.
func (r *Rectangle) SetPosition(x int, y int) {
  ...
}

func (r *Rectangle) GetPosition() (x int, y int) {
  ...
}
func (r *Rectangle) DrawOnSurface(s Surface) {
   ...
}

// Other methods:
func (r *Rectangle) setCornerType(c CornerType) {
   ...
}
func (r *Rectangle) cornerType() (CornerType) {
   ...
}

type Circle struct {
  ...
}

// Methods to satisfy the Shape interface.
func (c *Circle) SetPosition(x int, y int) {
  ...
}

func (c *Circle) GetPosition() (x int, y int) {
  ...
}

func (c *Circle) DrawOnSurface(s Surface) {
  ...
}

// Other methods:
...

You can then use the interface type instead of the specific “concrete” type:

var someCircle *Circle = new(Circle)
var s Shape = someCircle
s.DrawOnSurface(someSurface)

Notice that we use a Shape, not a *Shape (pointer to Shape), even though we are casting from a *Circle (pointer to circle). “Interface values” seem to be implicitly pointer-like, which seems unnecessarily confusing. I guess it would feel more consistent if pointers to interfaces just had the same behaviour as these “interface values”, even if the language had to disallow interface types that weren’t pointers.

Types satisfy interfaces implicitly

However, there is no explicit declaration that a type should implement an interface.

In this way Go interfaces are like C++ concepts, though C++ concepts are instead a purely compile-time feature for use with generic (template) code. Your class can conform to a C++ concept without you declaring that it does. And therefore, like Go interfaces, you can, if you must, use an existing type without changing it.

The compiler still checks that types are compatible, but presumably by checking the types’ list of methods rather than checking a class hierarchy or list of implemented interfaces. For instance:

var a *Circle = new(Circle)
var b Shape = a // OK. The compiler can check that Circle has Shape's methods.

Like C++ with dynamic_cast, Go can also check at runtime. For instance, you can check if one interface value refers to an instance that also satisfies another interface:

// Sometimes the Shape (our interface type) is also a Drawable
// (another interface type), sometimes not.
var a Shape = Something.GetShape()

// Notice that we want to cast to a Drawable, not a *Drawable,
// because Drawable is an interface.
var b = a.(Drawable) // Panic (crash) if this fails.

var b, ok = a.(Drawable) // No panic.
if ok {
  b.DrawOnSurface(someSurface)
}

Or we can check that an interface value refers to a particular concrete type. For instance:

// Get Shape() returns an interface value.
// Shape is our interface.
var a Shape = Something.GetShape()

// Notice that we want to cast to a *Thing, not a Thing,
// because Thing is a concrete type, not an interface.
var b = a.(*Thing) // Panic (crash) if this fails.

var b, ok = a.(*Thing) // No panic.
if ok {
  b.DoSomething()
}

Runtime dispatch

Interface methods are also like C++ virtual methods (or any Java method), and interface variables are also like instances of polymorphic base classes. To actually call the interface’s method via an interface variable, the program needs to examine its actual type at runtime and call that type’s specific method. Maybe, as with C++, the compiler can sometimes optimize away that indirection.

This is obviously not as efficient as directly calling a method, identified at compile time, of a templated type in a C++ template. But it is obviously much simpler.

Comparing interfaces

Interface values can be compared sometimes, but this seems like a risky business. Interface values are:

  • Not equal if their types are different.
  • Not equal if their types are the same and only one is nil.
  • Equal if their types are the same, and the types are comparable (see above), and their values are equal.

But if the types are the same, yet those types are not comparable, Go will cause a “panic” at runtime.

Wishing for an implements keyword

In C++ you can, if you wish, explicitly declare that a class should conform to the concept, or you can explicitly derive from a base class, and in Java you must use the “implements” keyword. Not having this with Go would take some getting used to. I’d want these declarations to document my architecture, explicitly showing what’s expected of my”concrete” classes in terms of their general purpose instead of just expressing their that by how some other code happens to use them. Not having this feels fragile.

The book suggests putting this awkward code somewhere to check that a type really implements an interface. Note the use of _ to mean that we don’t need to keep a named variable for the result.

var _ MyInterface = (*MyType)(nil)

The compiler should complain that the conversion is impossible if the type does not satisfy the interface. I think it would be wise this as the very minimum of testing, particularly if your package is providing types that are not really used in the package itself. For me, this is a poor substitute for an obvious compile-time check, using a specific language construct, on the type itself.

Interface embedding

Embedding an interface in an interface

Go has no notion of inheritance hierarchies, but you can “embed” one interface in another, to indicate that a class that satisfies one interface also satisfies the other. For instance:

type Positionable interface {
  SetPosition(x int, y int)
  GetPosition() (x int, y int)
}

type Drawable interface {
  drawOnSurface(s Surface) }
}

type Shape interface {
  Positionable
  Drawable
}

To satisfy the Shape interface, any type must also satisfy the Drawable and Positionable interfaces. Therefore, any type that satisfies the Shape interface can be used with a method associated with the Drawable or Positionable interfaces. So it’s a bit like a Java interface extending another interface.

Embedding an interface-satisfying struct in a struct

We saw earlier how you can embed one struct in another anonymously. If the contained struct implements an interface, then the containing struct then also implements that interface, with no need for manually-implemented forwarding methods. For instance:

type Drawable interface {
 drawOnSurface(s Surface)
}

type Painter struct {
  ...
}

// Make Painter satisfy the Drawable interface.
func (p *Painter) drawOnSurface(s Surface) {
  ...
}

type Circle struct {
 // Make Circle satisfy the Drawable interface via Painter.
 Painter
 ...
}

func main() {
  ...
  var c *Circle = new(Circle)
 
  // This is OK.
  // Circle satisfies Drawable, via Painter
  c.drawOnSurface(someSurface)

  // This is also OK.
  // Circle can be used as an interface value of type Drawable, via Painter.
  var d Drawable = c
  d.drawOnSurface(someSurface)
}

This again feels a bit like inheritance.

I actually quite like how the (interfaces of the) anonymously contained structs affect the interface of the parent struct, even with Go’s curious interface system, though I wish the syntax was more obvious about what is happening. It might be nice to have something similar in C++. Encapsulation instead of inheritance (and the Decorator pattern) is a perfectly valid technique, and C++ generally tries to let you do things in multiple ways without having an opinion about what’s best, though that can itself be a source of complexity. But in C++ (and Java), you currently have to hand-code lots of forwarding methods to achieve this and you still need to inherit from something to tell the type system that you support the encapsulated type’s interface.

 

A C++ developer looks at Go (the programming language), Part 1: Simple Features

This is the first part of a series. Here are all 3 parts:

I’m reading “The Go Programming Language” by Brian Kernighan and Alan Donovan. It is a perfect programming language introduction, clearly written and perfectly structured, with nicely chosen examples. It contains no hand-waving – it’s aware of other languages and briefly acknowledges the choices made in the language design without lengthy discussion.

As an enthusiastic C++ developer, and a Java developer, I’m not a big fan of the overall language. It seems like an incremental improvement on C, and I’d rather use it than C, but I still yearn for the expressiveness of C++. I also suspect that Go cannot achieve the raw performance of C or C++ due to its safety features, though that maybe depends on compiler optimization. But it’s perfectly valid to knowingly choose safety over performance, particularly if you get more safety and more performance than with Java.

I would choose Go over C++ for a simple proof of concept program using concurrency and networking. Goroutines and channels, which I’ll mention in a later post, are convenient abstractions, and Go has standard API for HTTP requests. Concurrency is hard, and it’s particularly easy to choose safety over performance when writing network code.

Here are some of my superficial observations about the simpler features, which mostly seem like straightforward improvements on C. In part 2 I’ll mention the higher-level features and I’ll hopefully do a part 3 about concurrency. I strongly recommend that you read the book to understand these issues properly.

I welcome friendly corrections and clarifications. There are surely several mistakes here, hopefully none major.

No semicolons at the end of lines

Let’s start with the most superficial thing. Unlike C, C++, or Java, Go doesn’t need semicolons at the end of lines of code. So this is normal:

a = b
c = d

This is nicer for people learning their first programming language. It can take a while for those semicolons to become a natural habit.

No () parentheses with if and for

Here’s another superficial difference. Unlike C or Java, Go doesn’t put its conditions inside parentheses with if and for. That’s another small change that feels arbitrary and makes C coders feel less comfortable.

For instance, in Go we might write this:

for i := 0; i < 100; i++ {
  ...
}

if a == 2 {
  ...
}

Which in C would look like this:

for (int i = 0; i < 100; i++) {
  ...
}

if (a == 2) {
  ...
}

Type inference

Go has type inference, from literal values or from function return values, so you don’t need to restate types that the compiler should know about. This is a bit like C++’s auto keyword (since C++11). For instance:

var a = 1 // An int.
var b = 1.0 // A float64.
var c = getThing()

There’s also a := syntax that avoids the need for var, though I don’t see the need for both in the language:

a := 1 // An int.
b := 1.0 // A float64
d := getThing()

I love type inference via auto in modern C++, and find it really painful to use any language that doesn’t have this. Java feels increasingly verbose in comparison, but maybe Java will get there. I don’t see why C can’t have this. After all, they eventually allowed variables to be declared not just at the start of functions, so change is possible.

Types after names

Go has types after the variable/parameter/function names, which feels rather arbitrary, though I guess there are reasons, and personally I can adapt. So, in C you’d have

Foo foo = 2;

but in Go you’d have

var foo Foo = 2

Keeping a more C-like syntax would have eased C developers into the language. These are often not people who embrace even small changes in the language.

No implicit conversions

Go doesn’t have implicit conversions between types, such as int and uint, or floats and int. This also applies to comparison via == and !=.

So, these won’t compile:

var a int = -2
var b uint = a
var c int = b

var d float64 = 1.345
var e int = c

C compiler warnings can catch some of these, but a) People generally don’t turn on all these warnings, and they don’t turn on warnings as errors, and b) the warnings are not this strict.

Notice that Go has the type after the variable (or parameter, or function) name, not before.

Notice that, unlike Java, Go still has unsigned integers. Unlike C++’s standard library, Go uses signed integers for sizes and lengths. Hopefully C++ will get do that too one day.

No implicit conversions because of underlying types

Go doesn’t even allow implicit conversions between types that, in C, would just be typedefs. So, this won’t compile

type Meters int
type Feet int
var a Meters = 100
var b Feet = a

I think I’d like to see this as a warning in C and C++ compilers when using typedef.

However, you are allowed to implicitly assign a literal (untyped) value, which looks like the underlying type, to a typed variable, but you can’t assign from an actual typed variable of the underlying type:

type Meters int
var a Meters = 100 // No problem.

var i int = 100
var b Meters = i // Will not compile.

No enums

Go has no enums. You should instead use const values with the iota keyword. So, while C++ code might have this:

enum class Continent {
  NORTH_AMERICA,
  SOUTH_AMERICA,
  EUROPE,
  AFRICA,
  ...
};

Continent c = Continent::EUROPE;
Continent d = 2; // Will not compile

in Go, you’d have this:

type continent int

const (
  CONTINENT_NORTH_AMERICA continent = iota
  CONTINENT_SOUTH_AMERICA // Also a continent, with the next value via iota.
  CONTINENT_EUROPE // Also a continent, with the next value via iota.
  CONTINENT_AFRICA // Also a continent, with the next value via iota.
)

var c continent = CONTINENT_EUROPE
var d continent = 2 // But this works too.

Notice how, compared to C++ enums, particularly C++11 scoped enums, each value’s name must have an explicit prefix, and the compiler won’t stop you from assigning a literal number to a variable of the enum type. Also, the Go compiler doesn’t treat these as a group of associated values, so it can’t warn you, for instance, if you forget to mention one in a switch/case block.

Switch/Case: No fallthrough by default

In C and C++, you almost always need a break statement at the end of each case block. Otherwise, the code in the following case block will run too. This can be useful, particularly when you want the same code to run in response to multiple values, but it’s not the common case. In Go, you have to add an explicit fallthrough keyword to get this behaviour, so the code is more concise in the general case.

Switch/Case: Not just basic types

In Go, unlike in C and C++, you can switch on any comparable value, not just values known at compile time, such as ints,  enums, or other constexpr values. So you can switch on strings, for instance:

switch str {
  case "foo":
  doFoo()
case "bar":
  doBar()
}

This is convenient and I guess that it is still compiled to efficient machine code when it uses compile-time values. C++ seems to have resisted this convenience because it couldn’t always be as efficient as a standard switch/case, but I think that unnecessarily ties the switch/case syntax to its original meaning in C when people expected to be more aware of the mapping from C code to machine code.

Pointers, but no ->, and no pointer arithmetic

Go has normal types and pointer types, and uses * and & as in C and C++. For instance:

var a thing = getThing();
var p *thing = &a;
var b thing = *p; // Copy a by value, via the p pointer

As in C++, the new keyword returns a pointer to a new instance:

var a *thing = new(thing)
var a thing = new(thing) // Compilation error

This is like C++, but unlike Java, in which any non-fundamental types (not ints or booleans, for instance) are effectively used via a reference (it just looks like a value), which can confuse people at first by allowing inadvertent sharing.

Unlike C++, you can call a method on a value or a pointer using the same dot operator:

var a *Thing = new(Thing) // You wouldn't normally specify the type.
var b Thing = *a
a.foo();
b.foo();

I like this. After all, the compiler knows whether the type is a pointer or a value, so why should it bother me with complaints about a . where there should be a -> or vice-versa? However, along with type inference, this can slightly obscure whether your code is dealing with a pointer (maybe sharing the value with other code) or a value. I’d like to see this in C++, though it would be awkward with smart pointers.

You cannot do pointer arithmetic in Go. For instance, if you have an array, you can’t step through that array by repeatedly adding 1 to a pointer value and dereferencing it. You have to access the array elements by index, which I think involves bounds checking. This avoids some mistakes that can happen in C and C++ code, leading to security vulnerabilities when your code accesses unexpected parts of your application’s memory.

Go functions can take parameters by value or by pointer. This is like C++, but unlike Java, which always takes non-fundamental types by (non const) reference, though it can look to beginner programmers as if they are being copied by value. I’d rather have both options with the code showing clearly what is happening via the function signature, as in C++ or Go.

Like Java, Go has no notion of const pointers or const references. So if your function takes a parameter as a pointer, for efficiency, your compiler can’t stop you from changing the value that it points to. In Java, this is often done by creating an immutable type, and many Java types, such as String, are immutable, so you can’t change them even if you want to. But I prefer language support for constness as in C++, for pointer/reference parameters and for values initialized at runtime. Which leads us to const in Go.

References, sometimes

Go does seem to have references (roughly, pointers that look like values), but only for the built-in slice, map, and channel types.  (See below about slices and maps.) So, for instance, this function can change its input slide parameter, and that change will be visible to the caller, even though the parameter is not declared as a pointer:

func doThing(someSlice []int) {
  someSlice[2] = 3;
}

In C++, this would be more obviously a reference:

void doThing(Thing& someSlice) {
  someSlice[2] = 3;
}

I’m not sure if this is a fundamental feature of the language or just something about how those types are implemented. It seems confusing for just some types to act differently, and I find the explanation a bit hand-wavy. Convenience is nice, but so is consistency.

const

Go’s const keyword is not like const in C (rarely useful) or C++, where it indicates that a variable’s value should not be changed after initialization. It is more like C++’s constexpr keyword (since C++11), which defines values at compile time. So it’s a bit like a replacement for macros via #define in C, but with type safety. For instance:

const pi = 3.14

Notice that we don’t specify a type for the const value, so the value can be used with various types depending on the syntax of the value, a bit like a C macro #define. But we can restrict it by specifying a type:

const pi float64 = 3.14

Unlike constexpr in C++, there is no concept of constexpr functions or types that can be evaluated at compile time, so you can’t do this:

const pi = calculate_pi()

and you can’t do this

type Point struct {
  X int
  Y int
}

const point = Point{1, 2}

though you can do this with a simple type whose underlying type can be const:

type Yards int
const length Yards = 100

Only for loops

All loops in Go are for loops – there are no while or do-while loops. This simplifies the language in one way compared, for instance, to C, C++, or Java, though there are now multiple forms of for loop.

For instance:

for i := 0; i < 100; i++ {
  ...
}

or, like a while loop in C:

for keepGoing {
  ...
}

And for loops have a range-based syntax for containers such as string, slices or maps, which I’ll mention later:

for i, c := range things {
  ...
}

C++ has range-based for loops too, since C++11, but I like that Go can (optionally) give you the index as well as the value. (It gives you the index, or the index and the value, letting you ignore the index with the _ variable name.)

A native (Unicode) String type

Go has a built-in string type, and built in comparison operators such as ==, !=, and < (as does Java). Like Java, Strings are immutable, so you can’t change them after you’ve created them, though you can create new Strings by concatenating other Strings with the built in operator +. For instance:

str1 := "foo"
str2 := str1 + "bar"

Go source code is always UTF-8 encoded and string literals may contain non-ASCII utf-8 code points. Go calls Unicode code points “runes”.

Although the built-in len() function returns the number of bytes, and the built in operator [] for strings operates on bytes, there is a utf8 package for dealing with strings as runes (Unicode code points). For instance:

str := "foo"
l := utf8.RuneCountInString(str)

And the range-based for loop deals in runes, not bytes:

str := "foo"
for _, r := range str {
  fmt.Println("rune: %q", r)
}

C++ still has no standard equivalent.

Slices

Go’s slices are a bit like dynamically-allocated arrays in C, though they are really views of an underlying array, and two slices can be views into different parts of the same underlying array. They feel a bit like std::string_view from C++17, or GSL::span, but they can be resized easily, like std::vector in C++17 or ArrayList in Java.

We can declare a span like so, and append to it:

a := []int{5, 4, 3, 2, 1} // A slice
a = append(a, 0)

Arrays (whose size cannot change, unlike slices) have a very similar syntax:

a := [...]int{5, 4, 3, 2, 1} // An array.
b := [5]int{5, 4, 3, 2, 1} // Another array.

You must be careful to pass arrays to functions by pointer, or they will be (deep) copied by value.

Slices are not (deep) comparable, or copyable, unlike std::array or std::vector in C++, which feels rather inconvenient.

Slices don’t grow beyond their capacity (which can be more than their current length) when you append values. To do that you must manually create a new slice and copy the old slice’s elements into it. You can keep a pointer to an element in a slice (really to the element in the underlying array). So, as with maps (below), the lack of resizing is probably to remove any possibility of an pointer becoming invalid.

The built in append() function may allocate a bigger underlying array if it would need more than the existing capacity (which can be more than the current length). So you should always assign the result of append() like so:

a = append(a, 123)

I don’t think you can keep a pointer to an element in a slice. If you could, the garbage collection system would need to keep the previous underlying array around until you had stopped using that pointer.

Unlike C or C++ arrays, and unlike operator [] with std::vector, attempting to access an invalid index of a slice will result in a panic (effectively a crash) rather than just undefined behaviour. I prefer this, though I imagine that the bounds checking has some small performance cost.

Maps

Go has a built-in map type. This is roughly equivalent to C++’s std::map (balanced binary trees), or std::unordered_map (hash tables). Go maps are apparently hash tables but I don’t know if they are separate-chaining hash tables (like std::unordered_map) or open-addressing hash tables (like nothing in standard C++ yet, unfortunately).

Obviously, keys in hash tables have to be hashable and comparable. The book mentions comparability, but so few things are comparable that they would all be easily hashable too. Only basic types (int, float64, string, etc, but not slices) or structs made up only of basic types are comparable, so that’s all you can use as a key. You can get around this by using a basic type (such as an int or string) that is (or can be made into) a hash of your value. I prefer C++’s need for a std::hash<> specialization, though I wish it was easier to write one.

Unlike C++’, you can’t keep a pointer to an element in a map, so changing one part of the value means copying the whole value back into the map, presumably with another lookup. Go apparently does this to completely avoid the problem of invalid pointers when the map has to grow. C++ instead lets you take the risk, specifying when your pointer could become invalid.

Go maps are clearly a big advantage over C, where you otherwise have to use some third-party data structure or write your own, typically with very little type safety.

They look like this:

m := make(map[int]string)
m[3] = "three"
m[4] = "four"

Multiple return values

Functions in Go can have multiple return types, which I find more obvious then output parameters. For instance:

func getThings() (int, Foo) {
  return 2, getFoo()
}

a, b := getThings()

This is a bit like returning tuples in modern C++, particularly with structured bindings in C++17:

std::tuple<int, Foo> get_things() {
  return make_tuple(2, get_foo());
}

auto [i, f] = get_things();

Garbage Collection

Like Java, Go has automatic memory management, so you can trust that instances will not be released until you have finished using them, and you don’t need to explicitly release them. So you can happily do this, without worrying about releasing the instance later:

func getThing() *Thing {
  a := new(Thing)
  ...
  return a
}

b := getThing()
b.foo()

And you can even do this, not caring, and not easily even knowing, whether the instance was created on the stack or the heap:

func getThing() *Thing {
  var a Thing
  ...
  return &a
}

b := getThing()
b.foo()

I don’t know how Go avoids circular references or unwanted “leak” references, as Java or C++ would with weak references.

I wonder how, or if, Go avoids Java’s problem with intermittent slowdowns due to garbage collection. Go seems to be aimed at system-level code, so I guess it must do better somehow.

However, also like Java, and probably like all garbage collection, this is only useful for managing memory, not resources in general. The programmer is usually happy to have memory released some time after the code has finished using it, not necessarily immediately. But other resources, such as file descriptors and database connections, need to be released immediately. Some things, such as mutex locks, often need to be released at the end of an obvious scope. Destructors make this possible. For instance, in C++:

void Something::do_something() {
  do_something_harmless();

  {
    std::lock_guard<std::mutex> lock(our_mutex);
    change_some_shared_state();
  }
  
  do_something_else_harmless();
}

Go can’t do this, so it has defer() instead, letting you specify something to happen whenever a function ends. It’s a annoying that defer is associated with functions, not to scopes in general.

func something() {
  doSomethingHarmless()

  ourMutex.Lock()
  defer ourMutex.Unlock()
  changeSomeSharedState()

  // The mutex has not been released yet when this remaining code runs,
  // so you'd want to restrict the use of the resource (a mutex here) to
  // another small function, and just call it in this function.
  doSomethingElseHarmless()
}

This feels like an awkward hack, like Java’s try-with-resources.

I would prefer to see a language that somehow gives me all of scoped resource management (with destructors), reference-counting (like std::shared_ptr<>) and garbage collection, in a concise syntax, so I can have predictable, obvious, but reliable, resource releasing when necessary, and garbage collection when I don’t care.

Of course, I’m not pretending that memory management is easy in C++. When it’s difficult it can be very difficult. So I do understand the choice of garbage collection. I just expect a system level language to offer more.

Things I don’t like in Go

As well as the minor syntactic annoyances mentioned above, and the lack of simple generic resource (not just memory) management, I have a couple of other frustrations with the language.

(I’m not loving the support for object orientation either, but I’ll mention that in a later article when I’ve studied it more.)

No generics

Go’s focus on type safety, particularly for numeric types, makes the lack of generics surprising. I can remember how frustrating it was to use Java before generics, and this feels almost that awkward. Without generics I soon find myself having to choose between lack of type safety or repeatedly reimplementing code for each type, feeling like I’m fighting the language.

I understand that generics are difficult to implement, and they’d have to make a choice about how far to take them (probably further than Java, but not as far as C++), and I understand that Go would then be much more than a better C. But I think generics are inevitable once, like Go, you pursue static type safety.

Somehow go’s slice and map containers are generic, probably because they are built-in types.

Lack of standard containers

Go has no queue or stack in its standard library. In C++, I use std::queue and std::stack regularly. I think these would need generics. People can use go’s slice (a dynamically-allocated array) to achieve the same things, and you can wrap that up in your own type, but your type, can only contain specific types, so you’ll be reimplementing this for every type. Or your container can hold interface{} types (apparently a bit like a Java Object or a C++ void*), giving up (static) type safety.

BigOQuiz: C++ Standard Library Algorithms

I’ve reworked bigoquiz.com, which I created a few weeks ago. to allow multiple quizzes instead of just the default quiz about the Big-O complexity of algorithms and data structures. For some quizzes it also now offers reversed questions, asking you for the question that matches the answer.

For instance, there is now a C++ Standard Algorithms quiz. You could, for instance, check that you know the descriptions of the Sort Operations functions. Or you could check that you know the names of those Sort Operations functions based on their descriptions (or as multiple-choice). Maybe this is one way to get more familiar with the many standard C++ algorithms that we often don’t realize are available. I hope this helps me as I strive to use “no raw loops“.

As before, you can track your progress in each section of the quiz, and identify problem questions that you keep getting wrong.

screenshot_bigoquiz_cpp_std_algorithms_sections

I’ve also added a Design Patterns quiz, based on the patterns mentioned in the Head-First Design Patterns book.

The user profile page also now shows overall statistics for all the quizzes that you’ve tried.

screenshot_bigoquiz_userprofile

 

Big-O Quiz: An experiment with Google AppEngine

bigoquiz.com

screenshot_bigoquiz_webOver the last few weeks I’ve been working on bigoquiz.com. I wanted to learn the Big-O algorithmic complexity of the main algorithms and data structures by heart. So, of course, I first built a system to help me do that.

bigocheatsheet.com and bigoref.com already do a good job of listing the information, but bigoquiz.com lets you check that you really know each one, and shows you the ones you keep getting wrong.

Tech I learned along the way

I find it very hard to resist a project when I see that it would let me play with multiple new things. This one was a chance to learn about:

Details

I noticed that bigoref.com had some corrections and additions compared to bigcheatsheet.com. And I made some additions of my own. I’ve listed it all here for anybody who is interested:

For instance,  bigoref.com adds Graph Search algorithms such as Dijkstra’s and Bellman-Ford (missing from bigocheatsheet.com), under “Searching“. I instead listed the graph search algorithms in bigoquiz.com under “Graph Search”  and added more graph algorithms that I’d learned about recently in my Coursera course, though I’m tempted to create a completely different graph algorithms quiz: Kruskal’s Minimum Spanning Tree, Prim’s Minimum Spanning Tree, Floyd Warshall All Pairs Shortest Path, and Johnson’s All Pairs Shortest Path.

bigoref.com also adds linear and binary search of arrays (missing from bigocheatsheet.com). For bigoquiz.com I just split Array into sorted and unsorted in “Data Structure Operations”.

I found just one error on bigoquiz.com that was corrected on bigoref.com, though there might be others:

And I noticed that bigoref.com has some additional errors so I corrected those on bigoquiz.com and  filed github issues for bigoref:

Also:

  • bigocheatsheet.com has entries for Stack, but bigoref.com does not. I kept that in bigoquiz.com.
  • bigoref.com has entries for Ternary Search Tree, apparently a kind of Trie (Prefix Tree), but bigocheatsheet.com does not. I kept this in bigoquiz.com. Curiously, bigoref.com doesn’t have entries for regular Tries (Prefix Trees), or Suffix Trees. I added them to bigoquiz.com, though I’m not completely sure about the various best/worst/average time/space complexities.
  • bigoref.com uses the term “indexing” where bigocheatsheet.com uses “access”. I’ve stuck with “access” for bigoquiz.com.
  • bigoref.com splits array into Basic Array and Dynamic Array, not allowing insertion or deletion in a Basic Array. I don’t find the distinction useful, so I kept it as just Array in bigoquiz.com.

GWTP: Model/View/Presenter for GWT with GIN/GUICE dependency injection

I’ve building a new website project with GWT, so I thought I’d take the opportunity to try GWTP, which provides an MVP framework for GWT. I’ve previously used bare GWT to do much the same thing while developing OnlineGlom, but I hoped it could be easier. Both are really for structuring client-side UI and behaviour.

Once you’ve got your GWT application built, you can almost forget how difficult it was to create the basic structure. So I’ve taken the time to describe both the old (GWT) way and the new (GWTP) way of doing things to remind myself how much better life is with GWTP. I mostly just discuss how the activities/presenters and views interact in the two systems. You would need to learn a bit more to write a full application. If you are familiar with GWT, you might skip to the GWTP section.

You’ll see that, even with GWTP, the code is far from concise, but that’s hard to avoid completely with Java, particularly when trying to split code into Presenters and Views, and splitting implementation and interface. Overall, I really wish that GWTP was just part of GWT with the old GWT Activities deprecated. Then people could focus on this one right way of doing things, and making it even better.

For GWT, I refer to my old OnlineGlom project. For GWTP, I refer to the little murrayc-gwtp-appengine-example project I created recently.

GWT: Activities/Views/Places

GWT provides the Activities and Places framework, and suggests you create Views for your UI. That page has a good overview and explanation, but I’ll go over the parts here too.

Place

GWT’s Activities and Places framework is mixed up with GWT’s history support, which lets the different states of your application have URLs that can appear in browser bookmarks and history. That’s how we have a “history token”, which describes a Place.

For instance, for each main location in your app, you would have SomethingPlace, extending GWT’s Place. Here’s an example from OnlineGlom: DetailsPlace is for a page that shows the details from a database record.

The @Prefix(details) annotation on its Tokenizer tells GWT that this place can handle URLs that begin with/ #/details, such as http:://onlineglom.something.com/#/details?table=things?primaryKey=123. The Place class takes the place’s parameters in its constructor and stores them. Those parameters usually refer to the ?something=blah parameters in the URLs.

getPlace() constructs a Place object from a token (mostly a URL), and getToken() constructs a token from a Place object. There isn’t any convenience API to avoid the tedious splitting and concatenation of strings, though we created a couple of helper methods for this in OnlineGlom.

Activity

Each place in your app will have an Activity, whose constructor will take an instance of your derived Place class. For instance, in OnlineGlom, the DetailsActivity takes a DetailsPlace. The Activity can use the Place’s getter methods to discover what the URL’s parameters have specified.

Your app will derive an ActivityMapper that maps the places to the Activities. For instance, in OnlineGlom, our DataActivityMapper‘s getActivity() method instantiates a DetailsActivity when it gets a DetailsPlace.

You’ll pass your ActivityMapper when creating an ActivityManager. For instance, in OnlineGlom, we create an ActivityManager that takes our DataActivityMapper. You’ll then call setDisplay() on your ActivityManager, passing an AcceptsOneWidget that will be used to show your activitity’s View.

This is already a lot of boilerplate code for some basic functionality.

View

Each place has a View. You’ll have one view interface and at least one implementation of that interface. For instance, OnlineGlom has its DetailsView interface and its DetailsViewImpl class that implements that interface. Your Activity can call methods on the view to make it show appropriate UI for the Activity’s state.

GWT has no base View class or interface, though we created a base View class for OnlineGlom, which has a View.Presenter interface implemented by all the Activities. Each Activity can then call setPresenter(this) on the view, giving the view a way to respond back to the Activity, such as telling the Activity that the user has clicked on something that should take the user to a new location.

Your ClientFactoryImpl, which implements your ClientFactory interface, instantiates the view implementation based on the view interface. For instance, OnlineGlom’s ClientFactoryImpl creates a DetailsViewImpl when asked for a DetailsView.

Your Activity takes your ClientFactory and uses it to create its View. For instance, in OnlineGlom’s DetailsActivity constructor, which was itself called by the DetailsActivityManager that I mentioned earlier. The actual ClientFactoryImpl is instantiated by a call to GWT.create(ClientFactory.class), using the ClientFactory to ClientFactoryImpl mapping in a .gwt.xml file. For instance, in OnlineGlom.gwt.xml:

<!-- Use ClientFactoryImpl by default -->
<replace-with class="org.glom.web.client.ClientFactoryImpl">
  <when-type-is class="org.glom.web.client.ClientFactory" />
</replace-with>

In theory, you might want different implementations of your View interface for different devices. Maybe you’d have DetailsViewMobileImpl, for instance, but that’s not a technique that I’ve personally found useful. However, allowing multiple implementations of the View also allows testing of your Activity by mocking the View,  letting you test logic without having to test UI behaviour at the same time.

Again, you can see that you have to implement lots of repetitive boilerplate code and you might wonder why you need to bother.

GWTP: Presenters/Views/Places

GWTP uses Presenters rather than Activities, but they are much the same thing: code that tells your UI View what to do without telling the UI View how to do it. But GWTP requires less repetitive code, by using dependency injection with GIN (GUICE for GWT) to tie the pieces together.

If you are returning to Java after some time away then Dependency Injection is your Tirimasu. This Google I/O 2009: Big Modular Java with GUICE talk seems to be an excellent introduction to dependency injection in general, and dependency injection with GUICE.

For instance, this lets us add a parameter to a constructor, and as long as the constructor has the @Inject annotation, and as long as we’ve told GIN (or GUICE) what implementation types to use, GIN (or GUICE) can end up calling that constructor with an appropriate instance for the parameter.

Place

WIth GWTP, you don’t need to implement any actual Place classes. Instead, you just use the @NameToken annotation on your Presenter’s proxy (declared via the @ProxyStandard annotation). For instance, in murrayc-gwtp-appengine-example’s ThingPresenter:

@ProxyStandard
@NameToken(NameTokens.THING)
interface MyProxy extends ProxyPlace {
}

That NameTokens.THING is just a string constant that I’ve put together with the others.

Presenter

Your Presenter should derive from GWTP’s Presenter, which is a generic class that takes your View interface and your Proxy interface.

For instance in murrayc-gwtp-appengine-example’s ThingPresenter:

public class ThingPresenter
  extends Presenter<ThingPresenter.MyView, ThingPresenter.MyProxy>

The presenters’s prepareFromRequest() override uses simple getParameter() calls to get the URL’s parameters, without the need for a separate Place class and manual string parsing. For instance, in ThingPresenter’s prepareFromRequest():

@Override
public void prepareFromRequest(final PlaceRequest request) {
  super.prepareFromRequest(request);
  final String thingId =
    request.getParameter(NameTokens.THING_PARAM_THING_ID, null);

View

Your View should extend GWTP’s ViewImpl and implement your Presenter’s View interface. For instance:

public class ThingView
  extends ViewImpl
  implements ThingPresenter.MyView

However, you will very often want to derive instead from GWTP’s ViewWithUiHandlers, specifying a small UiHandlers interface you’ve created, so your View can notify its presenter about user interaction. For instance, in murrayc-gwtp-appengine-example’s ThingView:

public class ThingView
  extends ViewWithUiHandlers<ThingUserEditUiHandlers>
  implements ThingPresenter.MyView {

That ThingUserEditUiHandler should also be implemented by the presenter. For instance, in ThingPresenter:

public class ThingPresenter
  extends Presenter<ThingPresenter.MyView, ThingPresenter.MyProxy>
  implements ThingUserEditUiHandlers {

And your presenters’s View interface should extend HasUiHandlers. For instance, in ThingPresenter:

interface MyView
  extends View,
  HasUiHandlers<ThingUserEditUiHandlers> {

The View can then indirectly call a method on the presenter like so:

getUiHandlers().onThingChosen("blah");

However, you must first call your view’s setUIHandler() from your presenters constructor, like so:

getView().setUiHandlers(this);

Modules

Each Presenter/View should have a Module. For instance, murrayc-gwtp-appengine-example’s ThingModule links the presenter and view together like so:

public class ThingModule extends AbstractPresenterModule {
  @Override
  protected void configure() {
    bindPresenter(ThingPresenter.class,
      ThingPresenter.MyView.class,
      ThingView.class,
      ThingPresenter.MyProxy.class);
  }
}

These sub-modules would then be combined in one module for the whole application. For instance:

public class ApplicationModule extends AbstractPresenterModule
  @Override
  protected void configure() {
    install(new ThingModule());
    install(new UserProfileModule());

This is where you can control the dependency injection. By providing a different ApplicationModule or different sub-modules, you can cause different implementations to be instantiated. For instance, to create mock Views.

Events: Communication Between Presenters

You application might have several nested presenters on a page, so one presenter might need to respond to a change to another presenter. GWTP lets us do this by defining our own GwtEvent. For instance,  murrayc-gwtp-appengine-example’s ThingUserAddedEvent:

public class ThingUserAnswerAddedEvent
extends GwtEvent<ThingUserAnswerAddedEvent.EventHandler> {

One presenter may then fire that event, by calling its fire() method, providing any parameters needed by the event.

A presenter may handle the event by implementing your event’s EventHandler interface. For instance, in murrayc-gwtp-appengine-example’s UserHistoryRecentPresenter:

public class UserHistoryRecentPresenter
  extends PresenterWidget<UserHistoryRecentPresenter.MyView>
  implements ThingUserAnswerAddedEvent.EventHandler {

And then registering the presenter as a handler for the event, with GWTP’s addRegisteredHandler(). For instance in UserHistoryRecentPresenter’s constructor:

addRegisteredHandler(ThingUserAnswerAddedEvent.TYPE, this);

You must also use GWTP’s @ProxyEvent annotation (see this too) on the handler method:

@ProxyEvent
@Override
public void onThingUserAnswerAdded(final ThingUserAnswerAddedEvent event) {

 

Simple GWTP and Objectify example with Maven

I’ve been playing with GWTP (an MVP framework for GWT) and Google’s AppEngine (via Objectify). I prefer to use the Maven build system. So I created a simple small example that does this: murrayc-gwtp-appengine-example. Improvements are very welcome. I’d like to learn how the code should be improved.

In case it’s interesting to anyone, here’s why I wanted this little example to exist:

GWT

The official GWT website’s equivalent StockWatcher example/documentation does not demonstrate best practices. It:

There is official GWT sample code that uses maven and Objectify, but it’s all mixed up with other stuff and I wanted something much simpler.

There is also an official GWT MVP example from 2010 hidden away, but it’s not in git and I didn’t even find it until I started writing this.

Objectify

Objectify’s documentation seems pretty good now that I look at it again, though I struggled at first. I think Google (and StackExchange) kept taking me to example code that used one of the various older versions of the API.

Oddly, Objectify’s documentation suggests that you use objectify-gwt with GWT. but objectify-gwt doesn’t seem to have any documentation and I didn’t seem to need it myself.

GWTP

The official GWTP website’s documentation is rather scattered, with several simple typos that make its example code snippets inconsistent with each other. The documentation seems to have been restructured from some other source. For instance, several internal links don’t take you to what is apparently intended. Some internal links take you to the page itself.  Lots of the best documentation is spread across their blog posts, but not brought together properly. They do at least show use of maven in their beginner’s GWTP example. but that example code itself is not in version control as a real project. GWTP is a nice clean API that deserves better.

Of course, it is incredibly hard to describe source code in documentation and keep that documentation and source code up to date and in sync. It needs a system. It never works if you just put the source code inline where no compiler can touch it. Although it’s nice to show source code inline (ideally taken automatically from a real source file), you also need links to the real source code so people can see the latest version.