Started new contract this week for a vertical market internet startup in Munich. Very dull so far.
Built a nifty new i815e-based PC for home, but I’m waiting for RH7, so there’ll be no coding this week.
My email inbox is full of 6000 identical spams from “Tele Sports Betting”. I cleared my inbox but they’re still coming through. Here’s a bit of the header:
Received: from foxnt.foxnet.org [210.169.138.238] by mx29 via mtad (34FM.0700.3.03) with ESMTP id 943eiNRR80868M29; Thu, 14 Sep 2000 17:17:59 GMT
Received: from mail.cc.biu.ac.il ([38.33.6.181]) by foxnt.foxnet.org with Microsoft SMTPSVC(5.5.1877.197.19); Thu, 14 Sep 2000 02:03:57 0900
To: <lgtyu@sunpoint.net>
Return-Path: vmcepikmaj@esperanto.nu
Released my GNOME C++ Application Framework code: Bakery. It should start some discussion at least.
Released a new version of my MySQL C++ API branch. Generally insulted the original author by saying that I’ve improved
the original code by removing half of it.
Posted a slightly insulting mail to the gnome-devel list
suggesting that they clean up their code and stop breaking
their own rules.
Posted a bunch of patches to Gtk--. Just noticed that the last 4 posts on that list are by me, all on different subjects. Must drink less caffeine.
I’m back from a few days walking in the mountains around the Switzerland/Austria/Liechtenstein border. I’ll probably head back to the Alps next week if I’ve recovered.
I discovered Havoc Pennington’s Inti project,
which will include yet another GTK C++ wrapper, but this
time with the full support of Gnome and RedHat. I found out
about it from a page on Guillaume’s site explaining
why he left Gtk--. I was gutted about this for a while because it
doesn’t seem like there are enough differences to merit a
separate project, and it’s bound to be a long time until
it’s ready. However, I think I’ve come to terms with it. It
looks like they will have very similar architectures so I
should be able to make my Gtk-- apps mostly future-proof by
adding a little extra abstraction. Also, I have a lot of
confidence in Havoc based on his activities so far.
Building C/C++ libraries with automake and autoconf, and
Using C/C++ libraries with automake and autoconf
I just finished a closed source project that used all
these techniques and I had to document them anyway. Now it
should be much easier to suggest to maintainers that they
sort out their project files.
I leave my contract today. Finally. Naturally, now that I’m gloriously free and unemployed, the rain comes down. I’ve suggested to the company that I’ll maintain the code for free if they make it open source, but I don’t think they’ve got the balls to do that, even though it would help them sell loads of their core products.
The UK parliament passed the RIP bill yesterday. The security services will now be connected directly to all ISPs, monitoring everybody’s internet activity and emails. You can use encryption, but you’ve got to give the keys to the government or you go to jail. Looks like the UK turned to shit while I’ve been away. Hopefully this’ll disgust people enough that it gets revoked; otherwise it doesn’t look like I’ll ever be going home. There won’t be much tech work there soon anyway.
I’ve been using the Xerces C++ XML parser recently. It’s really quality code, by people who really know what they’re doing. There are some other developers in the same office who are struggling along with MSXML, and I particularly like telling them ‘Yeah, I found a bug, but I told them and they fixed it a couple of hours later’.
This week, I finally get to leave my dull contract. That’s if I manage to get free before the specs shift again. I’m looking forward to a long break. I think I’ll start with a couple of weeks in the Alps, then I’ll devote a few weeks to a particular open source project that I’ve been thinking about doing for some time. It’ll be an experiment: seeing if I’m able to give it the same amount of time and attention that I’d give to an employer’s project.
I put this on my website a long time ago, maybe around 1999, as an HTML page. This is it moved to my blog.
Pentti Kanerva’s Sparse Distributed Memory model is based on simple mathematical properties. It can be used to store and recall large amounts of data efficiently, without requiring that the data be completely accurate or that we know exactly what we need to recall.
Though it may not be an exact model of human memory, it shares enough characteristics to suggest that human memory works in a similar way.
The model deals with data in its binary form, without regard to what that data represents. All data can be stored as binary, although some translation format may need to be chosen.
In theory, each piece of data is simply one value in the set of possible values that could have been expected, or one point in the space of possible data points. This space is very simple if the data is binary: the space for a value which is n bits long has n dimensions, with only two possible points (0 and 1) in each dimension.
For a 2-bit value, imagine a 2 by 2 square. For a 3-bit value, imagine a 2 by 2 by 2 cube. There is no need to visualise values with more bits because the theory is so simple. In fact you may not find it useful to think in terms of a space at all.
The model needs to calculate how much one value is like another. Continuing with the ‘space of points’ idea, we can calculate the difference between two points just as we would for a 2 or 3 dimensional space, using Pythagoras’ theorem. However, for such a simple space Pythagoras seems like overkill so we can use an approximation called the ‘Hamming distance’.
The Hamming distance between two binary values is measured by counting the number of bits which are different. This turns out to be an adequate approximation.
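For binary values, the Hamming distance is just an XOR followed by a bit count. A minimal sketch in Python (representing the values as integers is my choice here, not part of the model):

```python
# Hamming distance between two equal-length binary values,
# represented here as Python integers: XOR the values so that
# differing bits become 1s, then count the 1-bits.
def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

# 0b1011 and 0b0010 differ in two bit positions.
print(hamming(0b1011, 0b0010))  # 2
```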
Most locations in a circle are near the edge of the circle, which means that each location is far from most of the other locations. This effect is even stronger for binary values under the Hamming distance. It is demonstrated mathematically and by example in Kanerva’s book.
A value is ‘indifferent’ to another if they differ by n/2 or more bits. Each value in the binary space is indifferent or very nearly indifferent to the vast majority (about 99 percent when n is large) of other values. This characteristic of the space is crucial to Kanerva’s model, because it allows a nearly-correct location to be much closer to the correct location than to the wrong locations.
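This concentration of distances around n/2 is easy to see by simulation. A rough sketch (the sample sizes here are arbitrary, chosen only to keep it quick): for 1000-bit values, the distance between random pairs is binomially distributed around 500, with a standard deviation of only about 16 bits, so essentially every pair is close to indifferent.

```python
import random

random.seed(0)
n = 1000  # bits per value

def rand_bits(n):
    return [random.randint(0, 1) for _ in range(n)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Distances between 200 random pairs all land very close to n/2 = 500,
# so a random value is (nearly) indifferent to almost every other value.
dists = [hamming(rand_bits(n), rand_bits(n)) for _ in range(200)]
print(min(dists), max(dists))
```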
When a value is written to a location (an address), the value is written into every location in the circle about that location. Likewise, when a value is read from a location it is calculated by examining the contents of every location in the circle about that location. A good approximation of the best match can be calculated by averaging each bit individually. This should not require much more computation than is already required to read the contents of every location in the circle.
Therefore, each address is involved in the storage of many values, and many addresses are involved in the storage of any one value. This is why the memory is ‘Distributed’. Because there is no need for the locations to be physically adjacent, this should make the memory resistant to local damage.
The more often a value is stored, the easier it will be to recall. This also allows older memories to fade gradually.
An n-dimensional binary space has 2^n points. When dealing with large values, for instance 1000-bit values, there are far too many points for any computer (or brain) to keep track of them all. Kanerva found that, when he ignored most of the points, the distributed nature of the space meant the model still had the same properties. The points which remain are called ‘hard locations’. Of course we lose some accuracy, but it is worth it to make the model possible.
When values are used as the address where they are to be stored, it becomes possible to home in on a stored value starting from an inaccurate (or incomplete) version of that value. The value read from the inaccurate value-address will be a more accurate version of that value-address, because the value was originally written into all addresses in the circle.
If this process is repeated then the address read will converge on the original stored value. If the iterations don’t converge then either the value wasn’t stored, or we have insufficient information.
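Putting the pieces together, here is a toy sketch of such a memory. All the parameters (the value size N, the number of hard locations M, the activation radius, the amount of noise in the cue) are invented for illustration and much smaller than Kanerva's; each hard location keeps a signed counter per bit, writes add into every location in the circle, and reads take a majority vote across the circle.

```python
import random

random.seed(1)
N = 256        # bits per value/address (illustrative; Kanerva uses ~1000)
M = 2000       # number of hard locations
RADIUS = 115   # activation circle radius, in bits (tuned for this sketch)

def rand_word():
    return [random.randint(0, 1) for _ in range(N)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

hard_addrs = [rand_word() for _ in range(M)]
counters = [[0] * N for _ in range(M)]  # one signed counter per bit, per location

def write(addr, value):
    # Add +1/-1 into every hard location within RADIUS of the address.
    for loc, ctr in zip(hard_addrs, counters):
        if hamming(loc, addr) <= RADIUS:
            for i, bit in enumerate(value):
                ctr[i] += 1 if bit else -1

def read(addr):
    # Sum the counters of every activated location; the sign of each
    # bit's sum is the majority vote, i.e. the 'average' of the values.
    sums = [0] * N
    for loc, ctr in zip(hard_addrs, counters):
        if hamming(loc, addr) <= RADIUS:
            for i in range(N):
                sums[i] += ctr[i]
    return [1 if s > 0 else 0 for s in sums]

# Store a value at its own address, then recall it from a noisy cue
# by iterating read(): each read lands nearer the stored value.
stored = rand_word()
write(stored, stored)

cue = stored[:]
for i in random.sample(range(N), 40):  # corrupt 40 of the 256 bits
    cue[i] ^= 1
for _ in range(3):
    cue = read(cue)
print(hamming(cue, stored))
```

With only one value stored, any activated location that was also written holds that value's pattern exactly, so the vote recovers it as long as the cue's circle overlaps the written circle at all; with many values stored, the overlapping writes blur together and the iteration is what pulls the reads back to the original.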
This is probably what makes the model seem most like human memory: it can recall information even when it isn’t sure what it should recall.
Sparse Distributed Memory, Pentti Kanerva
Artificial Minds, Stan Franklin – Mentions the theory briefly.
Information Theory – Deals, in part, with storing information as binary data.