The Java Class File Format

I put this on my website a long time ago, maybe around 1998, as an HTML page. This is it moved to my blog.

Introduction

Compiled binary executables for different platforms usually differ not only in the instruction set, libraries, and APIs at which they are aimed, but also by the file format which is used to represent the program code. For instance, Windows uses the COFF file format, while Linux uses the ELF file format. Because Java aims at binary compatibility, it needs a universal file format for its programs – the Class File format.

The class file consists of some data of fixed length and lots of data of variable length and quantity, often nested inside other data of variable length. Therefore it is generally necessary to parse the whole file to read any one piece of data, because you will not know where that data is until you have worked your way through all the data before it. The JVM would just read the class file once and either store the data from the class file temporarily in a more easily accessed (but larger) format, or just remember where everything is in the class file. For this reason, a surprisingly large amount of the code of any JVM will be concerned with the interpretation, mapping, and possibly caching of this class file format.

Please note that this document is just an overview. The actual Class File format description is in the ‘Java Virtual Machine Specification’ which can be found online here, or in printed form here.

The Start

The Class file starts with the following bytes:

Field          Length (number of bytes)  Example
magic          4                         0xCAFEBABE
minor_version  2                         0x0003
major_version  2                         0x002D

The ‘magic’ bytes are always set to 0xCAFEBABE and are simply a way for the JVM to check that it has loaded a class file rather than some other set of bytes.

The version bytes identify the version of the Class File format which this file conforms to. Obviously a JVM would have trouble reading a class file format which was defined after that JVM was written. Each new version of the JVM specification generally says what range of Class File versions it should be able to process.
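
As a rough illustration, here is a minimal Python sketch which reads these first 8 bytes. The function name read_class_header is my own invention for this example, and the sample bytes are simply the example values from the table above.

```python
import struct

def read_class_header(data: bytes):
    """Return (minor_version, major_version) from the first 8 bytes of a class file."""
    # ">IHH" means big-endian: one 4-byte unsigned int, then two 2-byte unsigned ints.
    magic, minor_version, major_version = struct.unpack_from(">IHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file: bad magic %#x" % magic)
    return minor_version, major_version

# The example bytes from the table above.
header = bytes.fromhex("CAFEBABE0003002D")
print(read_class_header(header))  # (3, 45)
```

A real loader would go on to compare the version numbers with the range it supports before reading any further.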

Constant Pool

The major_version is followed by the constant_pool_count (2 bytes), and the constant_pool table.

The constant_pool table consists of several entries which can be of various types, and therefore of variable length. There are constant_pool_count - 1 entries, and each entry is referred to by its 1-indexed position in the table. Therefore, the first item is referred to as Constant Pool item 1. An index into the Constant Pool table can be stored in 2 bytes.

Each entry can be one of the following, and is identified by the tag byte at the start of the entry.

Entry                        Tag  Contents
CONSTANT_Class               7    The name of a class.
CONSTANT_Fieldref            9    The name and type of a Field, and the class of which it is a member.
CONSTANT_Methodref           10   The name and type of a Method, and the class of which it is a member.
CONSTANT_InterfaceMethodref  11   The name and type of an Interface Method, and the Interface of which it is a member.
CONSTANT_String              8    The index of a CONSTANT_Utf8 entry.
CONSTANT_Integer             3    4 bytes representing a Java integer.
CONSTANT_Float               4    4 bytes representing a Java float.
CONSTANT_Long                5    8 bytes representing a Java long.
CONSTANT_Double              6    8 bytes representing a Java double.
CONSTANT_NameAndType         12   The Name and Type entry for a field, method, or interface.
CONSTANT_Utf8                1    2 bytes for the length, then a string in Utf8 (Unicode) format.

Note that the primitive types, such as CONSTANT_Integer, are stored in big-endian format, with the most significant byte first. This is the most obvious and intuitive way of storing values, but some processors (in particular, Intel x86 processors) use values in little-endian format, so the JVM may need to manipulate these bytes to get the data into the correct form.

Many of these entries refer to other entries, but they generally end up referring to one or more Utf8 entries.

For instance, here are the levels of containment for a CONSTANT_Fieldref entry:

  • CONSTANT_Fieldref
    • index to a CONSTANT_Class entry
      • index to a CONSTANT_Utf8 entry
    • index to a CONSTANT_NameAndType entry
      • index to a CONSTANT_Utf8 entry (name)
      • index to a CONSTANT_Utf8 entry (type descriptor)

Note that simple text names are used to identify entities such as classes, fields, and methods. This greatly simplifies the task of linking them together both externally and internally.
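
To make the containment concrete, here is a Python sketch which parses a constant pool and follows a CONSTANT_Fieldref down to its three Utf8 strings. The function names, and the tiny hand-built pool at the end, are hypothetical. It also makes two assumptions which are not covered in this overview: strings are decoded as plain UTF-8 (real class files use a slightly modified form), and, as the JVM specification requires, a CONSTANT_Long or CONSTANT_Double takes up two slots in the table.

```python
import struct

# Tag values from the table above.
UTF8, INTEGER, FLOAT, LONG, DOUBLE = 1, 3, 4, 5, 6
CLASS, STRING, FIELDREF, METHODREF, INTERFACE_METHODREF, NAME_AND_TYPE = 7, 8, 9, 10, 11, 12

def read_constant_pool(data: bytes, offset: int):
    """Parse the constant pool, starting at constant_pool_count.

    Returns (pool, next_offset). pool[0] is unused, so pool[i] matches
    the 1-based indexes used inside the class file.
    """
    (count,) = struct.unpack_from(">H", data, offset)
    offset += 2
    pool = [None] * count
    index = 1
    while index < count:
        tag = data[offset]
        offset += 1
        if tag == UTF8:
            (length,) = struct.unpack_from(">H", data, offset)
            offset += 2
            # Assumption: plain UTF-8 is close enough for this sketch.
            pool[index] = (tag, data[offset:offset + length].decode("utf-8"))
            offset += length
        elif tag in (CLASS, STRING):
            pool[index] = (tag,) + struct.unpack_from(">H", data, offset)
            offset += 2
        elif tag in (FIELDREF, METHODREF, INTERFACE_METHODREF, NAME_AND_TYPE):
            pool[index] = (tag,) + struct.unpack_from(">HH", data, offset)
            offset += 4
        elif tag in (INTEGER, FLOAT):
            pool[index] = (tag, data[offset:offset + 4])
            offset += 4
        elif tag in (LONG, DOUBLE):
            pool[index] = (tag, data[offset:offset + 8])
            offset += 8
            index += 1  # 8-byte constants occupy two pool slots
        else:
            raise ValueError("unsupported constant pool tag %d" % tag)
        index += 1
    return pool, offset

def describe_fieldref(pool, index):
    """Follow a CONSTANT_Fieldref to (class name, field name, type descriptor)."""
    _, class_index, name_and_type_index = pool[index]
    _, class_name_index = pool[class_index]
    _, name_index, descriptor_index = pool[name_and_type_index]
    return pool[class_name_index][1], pool[name_index][1], pool[descriptor_index][1]

def utf8(text):
    """Build the bytes of a CONSTANT_Utf8 entry (helper for the example only)."""
    raw = text.encode("utf-8")
    return bytes([UTF8]) + struct.pack(">H", len(raw)) + raw

# A hand-built pool with 6 entries, so constant_pool_count is 7.
example = (
    struct.pack(">H", 7)
    + bytes([FIELDREF]) + struct.pack(">HH", 2, 3)       # 1: Fieldref -> Class 2, NameAndType 3
    + bytes([CLASS]) + struct.pack(">H", 4)              # 2: Class -> Utf8 4
    + bytes([NAME_AND_TYPE]) + struct.pack(">HH", 5, 6)  # 3: NameAndType -> Utf8 5 and 6
    + utf8("Foo") + utf8("count") + utf8("I")            # 4, 5, 6
)
pool, end_offset = read_constant_pool(example, 0)
print(describe_fieldref(pool, 1))  # ('Foo', 'count', 'I')
```

Note how the parser has to walk every entry in turn: there is no way to find entry 6 without first measuring entries 1 to 5.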

The Middle Bit

access_flags (2 bytes)

These flags provide information about the class, and are combined by ORing the following values together:

  • ACC_PUBLIC
  • ACC_FINAL
  • ACC_SUPER
  • ACC_INTERFACE
  • ACC_ABSTRACT
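
As a small sketch, the flags can be decoded by testing each bit in turn. The numeric values below are not given in this overview; they are the ones listed for class access flags in the JVM specification. The function name is my own.

```python
# Class access flag values, as listed in the JVM specification.
CLASS_ACCESS_FLAGS = {
    0x0001: "ACC_PUBLIC",
    0x0010: "ACC_FINAL",
    0x0020: "ACC_SUPER",
    0x0200: "ACC_INTERFACE",
    0x0400: "ACC_ABSTRACT",
}

def decode_class_access_flags(access_flags: int):
    """Return the names of all the flags which are set in access_flags."""
    return [name for bit, name in CLASS_ACCESS_FLAGS.items() if access_flags & bit]

print(decode_class_access_flags(0x0021))  # ['ACC_PUBLIC', 'ACC_SUPER']
```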

this_class

These 2 bytes are an index to a CONSTANT_Class entry in the constant_pool, which should provide the name of this class.

super_class

Like this_class, but provides the name of the class’s parent class. Remember that Java only has single-inheritance, so there can not be more than one immediate base class.

Interfaces

2 bytes for the interfaces_count, and then a table of indexes to CONSTANT_Class entries, showing which interfaces this class ‘implements’ (or, for an interface, ‘extends’).

Fields

After the interfaces table, there are 2 bytes for the fields_count, followed by a table of field_info tables.

Each field_info table contains the following information:

Field             Length (number of bytes)  Description
access_flags      2                         e.g. ACC_PUBLIC, ACC_PRIVATE, etc
name_index        2                         Index of a CONSTANT_Utf8
descriptor_index  2                         Index of a CONSTANT_Utf8 (see type descriptors)
attributes_count  2
attributes        varies                    e.g. ConstantValue (see attributes)

Methods

After the fields table, there are 2 bytes for the methods_count, followed by a table of method_info tables. This has the same entries as the field_info table, with the following differences:

  • The access_flags are slightly different.
  • The descriptor has a slightly different format (see type descriptors)
  • A different set of attributes is included in the attributes table, most importantly the ‘Code’ attribute which contains the Java bytecode for the method. (see attributes)

Type Descriptors

Field and Method types are represented, in a special notation, by a string. This notation is described below.

Primitive Types

Primitive types are represented by one of the following characters:

byte B
char C
double D
float F
int I
long J
short S
boolean Z

For instance, an integer field would have a descriptor of “I”.

Classes

Classes are indicated by an ‘L‘, followed by the path to the class name, then a semi-colon to mark the end of the class name.

For instance, a String field would have a descriptor of “Ljava/lang/String;”

Arrays

Arrays are indicated with a ‘[‘ character.

For instance, an array of ints would have a descriptor of “[I”.

Multi-dimensional arrays simply have extra ‘[‘ characters. For instance, “[[I”.

Field Descriptors

A field has just one type, described in a string in the above notation. e.g. “I”, or “Ljava/lang/String;”.

Method Descriptors

Because methods involve several types – the arguments and the return type – their type descriptor notation is slightly different. The argument types are at the start of the string inside brackets, concatenated together. Note that the type descriptors are concatenated without any separator character. The return type is after the closing bracket.

For instance, “int someMethod(long lValue, boolean bRefresh);” would have a descriptor of “(JZ)I”.
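
Here is a short Python sketch of how such descriptors might be decoded. The function names are hypothetical. The ‘V’ entry for void return types is not mentioned above but comes from the JVM specification.

```python
PRIMITIVES = {"B": "byte", "C": "char", "D": "double", "F": "float",
              "I": "int", "J": "long", "S": "short", "Z": "boolean",
              "V": "void"}  # 'V' is only valid as a return type

def parse_type(descriptor: str, pos: int = 0):
    """Parse one type starting at pos; return (readable_name, next_pos)."""
    dimensions = 0
    while descriptor[pos] == "[":
        dimensions += 1
        pos += 1
    char = descriptor[pos]
    if char == "L":
        end = descriptor.index(";", pos)
        name = descriptor[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:
        name = PRIMITIVES[char]
        pos += 1
    return name + "[]" * dimensions, pos

def parse_method_descriptor(descriptor: str):
    """Split a method descriptor into (argument_types, return_type)."""
    if not descriptor.startswith("("):
        raise ValueError("method descriptors start with '('")
    pos = 1
    arguments = []
    while descriptor[pos] != ")":
        name, pos = parse_type(descriptor, pos)
        arguments.append(name)
    return_type, _ = parse_type(descriptor, pos + 1)
    return arguments, return_type

print(parse_method_descriptor("(JZ)I"))     # (['long', 'boolean'], 'int')
print(parse_type("[[I")[0])                 # int[][]
print(parse_type("Ljava/lang/String;")[0])  # java.lang.String
```

Because every type either has a fixed one-character code or ends with a semi-colon, no separator between the arguments is needed.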

Attributes

Both the field_info table and the method_info table include a list of attributes. Each attribute starts with the index of a CONSTANT_Utf8 (2 bytes) and then the length of the following data (4 bytes). The structure of the following data depends on the particular attribute type. This allows new or custom attributes to be included in the class file without disrupting the existing structure, and without requiring recognition in the JVM specification. Any unrecognised attribute types will simply be ignored.
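
The following Python sketch shows why this layout lets a reader skip attributes it does not understand: the length is always in the same place, so the data can be stepped over without being interpreted. The function name and the sample bytes are made up for illustration.

```python
import struct

def read_attributes(data: bytes, offset: int, count: int):
    """Read `count` attributes starting at offset.

    Returns (attributes, next_offset), where each attribute is a
    (name_index, raw_bytes) pair. The name_index still has to be looked
    up in the constant pool to find out which attribute this is.
    """
    attributes = []
    for _ in range(count):
        # 2 bytes for the name index, then 4 bytes for the length.
        name_index, length = struct.unpack_from(">HI", data, offset)
        offset += 6
        attributes.append((name_index, data[offset:offset + length]))
        offset += length  # this works even for unrecognised attributes
    return attributes, offset

# Two made-up attributes: one with 2 bytes of data, one with none.
sample = struct.pack(">HI", 5, 2) + b"\x00\x07" + struct.pack(">HI", 9, 0)
print(read_attributes(sample, 0, 2))  # ([(5, b'\x00\x07'), (9, b'')], 14)
```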

Attributes can contain sub-attributes. For instance, the Code attribute can contain a LineNumberTable attribute.

Here are some possible attributes:

Attribute           Description
Code                Details, including bytecode, of a method’s code.
ConstantValue       Used by ‘final’ fields.
Exceptions          Exceptions thrown by a method.
InnerClasses        A class’s inner classes.
LineNumberTable     Debugging information.
LocalVariableTable  Debugging information.
SourceFile          Source file name.
Synthetic           Shows that the field or method was generated by the compiler.

Code attribute

The Code attribute is used by the method_info table. It is where you will find the actual bytecodes (opcodes and operands) of the class’s methods.

The attribute has the following structure:

Field                   Length (number of bytes)  Description
max_stack               2                         Size of stack required by the method’s code.
max_locals              2                         Number of local variables required by the method’s code.
code_length             4
code                    code_length               The method’s executable bytecodes.
exception_table_length  2
exception_table         varies                    The exception handlers (try/catch ranges) in the method’s code.
attributes_count        2
attributes              varies                    e.g. LineNumberTable

Each exception table entry has the following structure, each describing one exception catch:

Field       Length (number of bytes)  Description
start_pc    2                         Offset of start of try/catch range.
end_pc      2                         Offset of end of try/catch range.
handler_pc  2                         Offset of start of exception handler code.
catch_type  2                         Type of exception handled.

These entries for the Code attribute will probably only make sense to you if you are familiar with the rest of the JVM specification.

Lisp

I put this on my website a long time ago, maybe around 1996, as an HTML page. This is it moved to my blog.

Introduction

LISP is short for List Processing. It was created by John McCarthy circa 1960, and is used today mainly in AI. There are many dialects of LISP, though Common LISP is now the agreed standard.

LISP is an interpreted, functional language with no in-built Object-Orientated mechanisms (Note from 2015: ANSI Common Lisp does now). Type-checking is loose, and because the program is capable of creating and executing new code during run-time, type-checking occurs at run-time.

LISP is characterised by its unusual, but simple, notation. Everything in LISP is a list. Lists are space-separated sequences of elements in parentheses. For instance, (a b c).

There is no distinction between the program and the data which the program uses. This allows programs to manipulate their own commands, which in turn allows programmers to act as meta-programmers.

A LISP program may create lists which it may then process. A List may contain other lists or atoms. All lists resolve to atoms, so all Lists which contain other lists are actually processed as lists of simple atoms. This method is intended to allow symbolic programming through use of recursive functions.

This neat concept enables straightforward coding of some classes of problems, but makes it harder to code most other programs. Data structure and process are emergent rather than clearly defined, so the program architecture is not easily apparent.

Evaluation

Using functions

The first element in the list may be a function name or simple operator. The other elements in the list are then regarded as the parameters to the function.

For instance, (* 2 3) or (foo 2 5).

Almost all expressions in LISP are of this form. In fact, many of the language’s keywords are simply pre-defined LISP functions, which must be used in the same manner as other functions. There are many of these built-in functions, which I will not describe here.

Preventing evaluation

If LISP attempts to evaluate a list which does not begin with a function name (for instance, (1 2 3)) then execution will halt with an error. When such lists are passed as parameters, the quote keyword (or its abbreviation: ' ) is used. This prevents the interpreter from first attempting to evaluate the list. For instance, (foo 2 '(a b c)).

Building new lists

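To return a list from a function, the list function may be used. It evaluates each of its parameters and returns a list of the results. Without using this, LISP would attempt to evaluate the list as a function call before it is returned.

For instance, (list 'a 'b 'c) evaluates to (a b c), and (list 1 (+ 1 1) 3) evaluates to (1 2 3).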

Note: Programs can force the evaluation of a list using the eval keyword.

Defining functions

The defun keyword can be used to define new functions. For instance,

( defun timesthree(x)
  (* x 3)
)

defines a function which multiplies a value by 3. The new function may be used as so:
(timesthree 2) evaluates to 6.

Conditionals

cond

By using the convention that nil (the empty list) is false and any other value is true, LISP allows conditional branching and boolean logic. Note that, unlike in C, the number zero counts as true.

The cond function takes a series of condition-result pairs. This construct is similar to a Switch-Case block in C. For instance,

( defun abs-val(x)
  (cond ( (< x 0) (- x) )
        ( (>= x 0) x )
  )
)

if

The if function takes an expression to be examined as true or false, and returns one of its two other parameters, depending upon the result. For instance,

(if (< x 0) (- x) x)

Boolean operators

The and and or functions act as boolean operators. Their left to right checking gives rise to a side-effect which is often used as a conditional branching technique. and stops checking when it encounters one item which is false, while or stops checking when it encounters one true item.

Recursion

LISP has traditionally relied on recursion rather than loop constructs to process data (although Common Lisp does provide loops such as do and dolist). A recursive function is a function which calls itself. This technique recognises that an operation on some data may be best expressed as the aggregate of the same operation performed on each item of data of which it is comprised. Obviously this technique is best used with data structures which have the same form at both higher and lower levels, differing only in scale.

This focus on recursion is the reason for LISP’s popularity with AI researchers, who often attempt to model large-scale behaviour in terms of smaller-scale decisions. For instance, recursion is often used in LISP to search state spaces.

Many of the lists used in LISP programs would be better referred to as trees. Lists are simply the mechanism used to represent those trees.

The car and cdr functions are generally used to recurse (or ‘walk’) through the elements in a tree, while cons is often used to gradually build tree structures to form the result of a recursive operation. By also using the null function to test for an empty list, we can walk through the tree structure, dealing with successively smaller pieces of the tree.

car returns the first element of a list. For instance, (car '(a b c)) evaluates to a.

cdr returns the list with the first element removed. For instance, (cdr '(a b c)) evaluates to (b c).

cons is an associated function which is used to build tree structures, often to form the result of a recursive operation. Note that it does not simply concatenate lists, but undoes the effects of a hypothetical use of car and cdr: its first parameter becomes the car of the new list and its second parameter becomes the cdr. For instance, (cons '(a b) '(c d e)) evaluates to ((a b) c d e) rather than (a b c d e).

Note that the use of these functions can lead to a great deal of inefficient copying.

Variables

Global variables

The setq function sets a named variable, while the setf function sets the item referred to by an expression. Note that with setf the item does not need to be an explicitly named variable. For instance,

(setq x (list 'a 'b 'c))
(setf (car x) 1) ; (car x) refers to the 1st element in x.

x now evaluates to (1 b c).

Local variables

The let function declares a local scope.

The parameters to the let function are a list of local variables and a list of expressions which may use these local variables. The let function effectively brackets the expressions, providing them with their own local variables. For instance,

(let (a b)
  (...)
  (...)
)

declares a and b as local variables, then evaluates the LISP statements contained in its list.

Bayesian Belief Networks

I put this on my website a long time ago, maybe around 1998, as an HTML page. This is it moved to my blog.

Introduction

Expert systems often calculate the probabilities of inter-dependent events by giving each parent event a weighting. Bayesian Belief Networks provide a mathematically correct and therefore more accurate method of measuring the effects of events on each other. The mathematics involved also allow us to calculate in both directions. So we can, for instance, find out which event was the most likely cause of another.

Bayesian Probability

Bayes’ Theorem

You are probably familiar with the following Product Rule of probability for independent events:

p(AB) = p(A) * p(B), where p(AB) means the probability of A and B happening.

This is actually a special case of the following Product Rule for dependent events, where p(A | B) means the probability of A given that B has already occurred:

p(AB) = p(A) * p(B | A)
p(AB) = p(B) * p(A | B)

So because: p(A) p(B | A) = p(B) p(A | B)
We have: p(A | B) = ( p(A) * p(B | A) ) / p(B) which is the simpler version of Bayes’ Theorem.

This equation gives us the probability of A happening given that B has happened, calculated in terms of other probabilities which we may know.

Note that: p(B) = p(B | A) * p(A) + p(B | ~A) * p(~A)
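
As a quick illustration, the simple version of Bayes’ Theorem and the expansion of p(B) can be combined into one small Python function. The function name and the numbers used are made up purely for this example.

```python
def bayes(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return p(A | B) from p(A), p(B | A) and p(B | ~A)."""
    p_not_a = 1.0 - p_a
    # p(B) = p(B | A) * p(A) + p(B | ~A) * p(~A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    # p(A | B) = ( p(A) * p(B | A) ) / p(B)
    return p_a * p_b_given_a / p_b

# Made-up numbers: p(A) = 0.01, p(B | A) = 0.9, p(B | ~A) = 0.1.
print(round(bayes(0.01, 0.9, 0.1), 4))  # 0.0833
```

Even though B is nine times more likely when A is true, the small value of p(A) keeps p(A | B) low.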

Chaining Bayes’ Theorem

We may wish to calculate p(AB) given that a third event, I, has happened. This is written p(AB | I). We can use the Product Rule, p(AB) = p(A) * p(B | A), with every probability conditioned on I:

p(AB | I) = p(A | I) * p(B | AI)
p(AB | I) = p(B | I) * p(A | BI)

so we have: p(A | BI) = ( p(A | I) * p(B | AI) ) / p(B | I) which is another version of Bayes’ Theorem.

This gives us the probability of A happening given that B and I have happened.

This is often quoted as p(H | EI) = ( p(H | I) * p(E | HI) ) / p(E | I), where p(H | EI) is the probability of Hypothesis H given Evidence E in Context I.

By using the product rule we can chain several probabilities together. For instance, to find the probability of H given that E1, E2 and I have happened:

p(H | E1E2I) = ( p(H | I) * p(E1E2 | HI) ) / p(E1E2| I)

and to find the probability of H given that E1, E2, E3 and I have happened:

p(H | E1E2E3I) = ( p(H | I) * p(E1E2E3 | HI) ) / p(E1E2E3 | I)

Note that p(E1E2E3 | I) = p(E1 | E2E3I) * p(E2E3 | I) = p(E1 | E2E3I) * p(E2 | E3I) * p(E3 | I), which can be used to calculate two of the values in the above equation.

An example of Bayes’ Theorem

p(H | EI) = ( p(H | I) * p(E | HI) ) / p(E | I)
p(H | EI) = ( p(H | I) * p(E | HI) ) / ( p(E | HI) * p(H | I) + p(E | ~HI) * p(~H | I) )
H is the Hypothesis ‘Guilty’,
E is an item of evidence,
I is the context.

p(H | EI) is the probability of the Hypothesis ‘Guilty’ being true, given the evidence in this context.
p(H | I) is the Prior Probability – the subjective probability of the Hypothesis regardless of the evidence.
p(E | HI) is the probability of the evidence being true given that the Hypothesis is true.
p(~H | I) = 1 – p(H | I).
p(E | ~HI) is the probability of the evidence given that the hypothesis is not true – this measures the chances of the evidence being caused by something other than the defendant’s guilt. If this is high then naturally the hypothesis will be unlikely.

Assuming Conditional Independence

If, given that I is true, E1 being true will not affect the probability of E2 being true, then a simpler version of the chained Bayes’ Theorem is possible:

p(H | E1E2I) = ( p(H | I) * p(E1 | HI) * p(E2 | HI) ) / ( p(E1 | I) * p(E2 | I) )

This version makes it very easy to introduce new evidence into the situation. However, Conditional Independence is only true in some special situations.

Prior Probabilities

One characteristic of Bayes’ Theorem is p(H | I), which is the probability of the hypothesis in context I regardless of the evidence. This is referred to as the Prior Probability. It is generally very subjective and is therefore frowned upon. This is not a problem as long as the prior probability plays a small role in the result. When the result is overly dependent on the prior probability more evidence should be considered.

Bayesian Belief Networks

A Bayesian Belief Network (BBN) defines various events, the dependencies between them, and the conditional probabilities involved in those dependencies. A BBN can use this information to calculate the probabilities of various possible causes being the actual cause of an event.

Setting up a BBN

For instance, if event C can be affected by events A and B:

[Figure: a simple BBN in which events A and B each affect event C]

We may know the following probabilities:

A:

True          False
p(A) = 0.1    p(~A) = 0.9

B:

True          False
p(B) = 0.4    p(~B) = 0.6

C: Note that when dependencies converge, there may be several conditional probabilities to fill in, though some can be calculated from others because the probabilities for each state should sum to 1.

A         True               True                False               False
B         True               False               True                False
C True    p(C | AB) = 0.8    p(C | A~B) = 0.6    p(C | ~AB) = 0.5    p(C | ~A~B) = 0.5
C False   p(~C | AB) = 0.2   p(~C | A~B) = 0.4   p(~C | ~AB) = 0.5   p(~C | ~A~B) = 0.5

Calculating Initialised probabilities

Using the known probabilities we may calculate the ‘initialised‘ probability of C, by summing the various combinations in which C is true, and breaking those probabilities down into known probabilities:

p(C) = p(CAB) + p(C~AB) + p(CA~B) + p(C~A~B)
= p(C | AB) * p(AB) +
p(C | ~AB) * p(~AB) +
p(C | A~B) * p(A~B) +
p(C | ~A~B) * p(~A~B)
= p(C | AB) * p(A) * p(B) +
p(C | ~AB) * p(~A) * p(B) +
p(C | A~B) * p(A) * p(~B) +
p(C | ~A~B) * p(~A) * p(~B)
= 0.518

So as a result of the conditional probabilities, C has a 0.518 chance of being true in the absence of any other evidence.
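
The arithmetic can be checked with a few lines of Python. This is only a sketch of the sum above, using the probabilities from the tables. The variable and function names are my own. Like the derivation, it treats A and B as independent of each other, so that p(AB) = p(A) * p(B).

```python
from itertools import product

p_a = {True: 0.1, False: 0.9}
p_b = {True: 0.4, False: 0.6}
# p(C | A, B), keyed by (A, B), taken from the table for C.
p_c_given = {(True, True): 0.8, (True, False): 0.6,
             (False, True): 0.5, (False, False): 0.5}

def initialised_p_c():
    """Sum p(C | A, B) * p(A) * p(B) over all four combinations of A and B."""
    return sum(p_c_given[a, b] * p_a[a] * p_b[b]
               for a, b in product([True, False], repeat=2))

print(round(initialised_p_c(), 3))  # 0.518
```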

Calculating Revised probabilities

If we know that C is true, we can calculate the ‘revised’ probabilities of A or B being true (and therefore the chances that they caused C to be true), by using Bayes’ Theorem with the initialised probability:

p(B | C) = ( p( C | B) * p(B) ) / p(C)
= ( ( p(C | AB) * p(A) + p(C | ~AB) * p(~A) ) * p(B) ) / p(C)
= ( (0.8 * 0.1 + 0.5 * 0.9) * 0.4 ) / 0.518
= 0.409
p(A | C) = ( p( C | A) * p(A) ) / p(C)
= ( ( p(C | AB) * p(B) + p(C | A~B) * p(~B) ) * p(A) ) / p(C)
= ( (0.8 * 0.4 + 0.6 * 0.6) * 0.1 ) / 0.518
= 0.131

So we could say that given C is true, B is more likely to be the cause than A.
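
Again, here is a small Python sketch to check these figures. The names are hypothetical, and it assumes that A and B are independent of each other before C is observed, as in the working above.

```python
p_a, p_b = 0.1, 0.4
# p(C | A, B), keyed by (A, B), taken from the table for C.
p_c_given = {(True, True): 0.8, (True, False): 0.6,
             (False, True): 0.5, (False, False): 0.5}

def prior(p, value):
    """Probability that an event with p(True) = p takes the given value."""
    return p if value else 1.0 - p

def joint(a, b):
    """p(C, A=a, B=b), treating A and B as independent of each other."""
    return p_c_given[a, b] * prior(p_a, a) * prior(p_b, b)

# Initialised probability of C.
p_c = sum(joint(a, b) for a in (True, False) for b in (True, False))

# Revised probabilities: sum the cases where the event is true, divide by p(C).
p_a_given_c = (joint(True, True) + joint(True, False)) / p_c
p_b_given_c = (joint(True, True) + joint(False, True)) / p_c

print(round(p_a_given_c, 3), round(p_b_given_c, 3))  # 0.131 0.409
```

Note that both revised probabilities are slightly higher than the original p(A) = 0.1 and p(B) = 0.4, since learning that C is true counts as evidence for each of them.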

Information Theory

I put this on my website a long time ago, maybe around 1997, as an HTML page. This is it moved to my blog.

Introduction

Information is a property of data. A piece of data holds more information if its content is less expected. ‘Man bites dog’ contains more information than ‘Dog bites man’.

The arrival of each new piece of data is an event. Intuitively, if the event is certain then it provides no information. If it is impossible then it provides infinite information. We may represent the Information numerically by using the equation I = log(1/p), sometimes written as I = -log(p), where p is the probability of an event occurring and I is the information provided by that event. This equation satisfies our intuitive ideas about information by providing a value of zero for a certain event and infinity for an impossible event. The value I will never be negative. The base of the logarithm is chosen arbitrarily.

Bits as units of Information

When using logarithms of base 2 to calculate Information, e.g. I = log2(1/p), a value of 1 for I indicates that the event provides enough information to answer a simple yes/no question. There are obvious similarities with the binary system of 1s and 0s. Therefore telecommunications and computer scientists often use base 2 logarithms and refer to each unit of Information as a bit.
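
A minimal Python sketch of this calculation follows. The function name is my own, and the sketch simply refuses p = 0 rather than trying to return infinity.

```python
import math

def information_bits(p: float) -> float:
    """Information, in bits, provided by an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be greater than 0 and no more than 1")
    return math.log2(1.0 / p)

print(information_bits(1.0))      # 0.0 (a certain event tells us nothing)
print(information_bits(0.5))      # 1.0 (one yes/no question answered)
print(information_bits(1 / 256))  # 8.0 (one of 256 equally likely values)
```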

In theory any item of information could be conveyed by answering the correct series of yes/no questions. An efficient use of binary storage therefore asks the smallest necessary number of such yes/no questions.

Information in a system (Entropy)

The amount of information in a system is a measure of the number of possible states which it may have. A more disorganised system with more possible states has greater information and is said to have greater Entropy. Systems tend towards greater entropy, thus becoming more disorganised. The classic example is that of a volume of gas which tends to maximise its entropy.

Note that the amount of entropy in the universe can only increase. A system can only become more organised at the expense of increased disorder elsewhere, generally as a dissipation of heat due to work.

Information capacity

The information capacity of a data store is a measure of how many different states it can be in. For instance, an 8-bit byte can store 8 1s or 0s, in 256 possible combinations. 8 = log2(1/(1/256)) = log2(256). Note that some amount of power is always required to maintain the integrity of any data store because, like any system, it will tend towards disorder.

Similarly, the information capacity of a communications channel is a measure of how many states it can be in during a given time period, stated in bits per second. This is a theoretical maximum capacity which depends on the physical properties of the channel rather than the particular method of coding the data. In theory the channel would actually convey information at the maximum capacity if the data was coded in the most compacted form possible.

Signal to noise ratio

Information theory matured in the field of telecommunications, where all communications channels contain some amount of useless noise. The signal-to-noise ratio compares the strength of the wanted signal with the strength of that noise.

The term is now often used slightly differently to refer to how compactly a message expresses its information. In this sense the English language has a low signal-to-noise ratio, because in theory many letters and words could be omitted without the reader understanding less.

State Space Search

I put this on my website a long time ago, maybe around 1997 as an HTML page. This is it moved to my blog.

The concept of State Space Search is widely used in Artificial Intelligence. The idea is that a problem can be solved by examining the steps which might be taken towards its solution. Each action takes the solver to a new state.

The classic example is of the Farmer who needs to transport a Chicken, a Fox and some Grain across a river one at a time. The Fox will eat the Chicken if left unsupervised. Likewise the Chicken will eat the Grain.

In this case, the State is described by the positions of the Farmer, Chicken, Fox and Grain. The solver can move between States by making a legal move (which does not result in something being eaten). Non-legal moves are not worth examining.

The solution to such a problem is a list of linked States leading from the Initial State to the Goal State. This may be found either by starting at the Initial State and working towards the Goal state or vice-versa.

The required State can be worked towards by either:

  • Depth-First Search: Exploring each strand of a State Space in turn.
  • Breadth-First Search: Exploring every link encountered, examining the state space a level at a time.

These techniques generally use lists of:

  • Closed States: States whose links have all been explored.
  • Open States: States which have been encountered, but have not been fully explored.

Ideally, these lists will also be used to prevent endless loops.
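
To tie these ideas together, here is a Python sketch of a Breadth-First Search for the Farmer problem. The state representation (a tuple of 0s and 1s showing which bank each of the Farmer, Chicken, Fox and Grain is on) and all of the names are my own choices for this example, and it assumes the Farmer may also cross the river alone.

```python
from collections import deque

# State: (farmer, chicken, fox, grain), 0 = near bank, 1 = far bank.
START = (0, 0, 0, 0)
GOAL = (1, 1, 1, 1)

def is_legal(state):
    """A state is legal if nothing gets eaten while the farmer is away."""
    farmer, chicken, fox, grain = state
    if fox == chicken and farmer != chicken:
        return False  # the fox eats the chicken
    if chicken == grain and farmer != chicken:
        return False  # the chicken eats the grain
    return True

def successors(state):
    """Yield every legal state reachable in one crossing."""
    farmer = state[0]
    for passenger in range(4):  # 0 means the farmer crosses alone
        if state[passenger] != farmer:
            continue  # the passenger must start on the farmer's bank
        new_state = list(state)
        new_state[0] = 1 - farmer
        new_state[passenger] = 1 - farmer
        new_state = tuple(new_state)
        if is_legal(new_state):
            yield new_state

def breadth_first_search(start, goal):
    """Return the shortest list of states from start to goal, or None."""
    open_states = deque([start])  # encountered, but not yet explored
    parents = {start: None}       # every state seen so far (open or closed)
    while open_states:
        state = open_states.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for next_state in successors(state):
            if next_state not in parents:  # prevents endless loops
                parents[next_state] = state
                open_states.append(next_state)
    return None

path = breadth_first_search(START, GOAL)
print(len(path) - 1)  # 7 crossings
for state in path:
    print(state)
```

Swapping popleft() for pop() would turn the queue into a stack and explore one strand at a time, which gives a Depth-First Search, although that is no longer guaranteed to find the shortest solution.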

FileMaker Templates

I put this on my website a long time ago, maybe around 1996, as an HTML page. This is it moved to my blog.

Introduction

Development and maintenance of complex FileMaker solutions is far easier when you have built your solutions from templates, because:

  • You’ll be able to throw together solutions in a quarter of the time it normally takes to do a version 1.
  • You’ll know it is robust because the template is tried and tested.
  • You’ll be able to look at each solution and know that it obeys certain set principles, which saves time when fixing problems or adding new functionality.
  • You’ll be able to create a User Manual and Developer Documentation in a fraction of the time because most of that text is the same for all the solutions you create.
  • You’ll be able to store details such as Company Name and Address in a System Constants file and use them all over the system. Just change these System Constants when you sell the system to someone else.

You can download my own templates here (the high-level password is colonelklink):

These templates are arranged in 3 folders:

  • Basic Templates – Regular.FP3 for a regular form view file, Lines.FP3 for hidden files such as Invoice Lines, and others.
  • Functional Templates – a Contacts tracking system, an Invoicing System
  • Other Templates – including a mail-merging system.

These templates represent a considerable head start when developing FileMaker systems. You are welcome to use them but naturally I accept no liabilities and offer no guarantees. If you use them it would be nice to receive a comment.

The following is both a description of these templates and general advice about FileMaker templates. Be sure to read Murray’s FileMaker Advice to learn about basic techniques which should be used in all FileMaker development.

Layouts

  • Menu
  • Edit Details
  • View Details
  • Find
  • List
  • All Fields

Menu

Each file should have its own menu with buttons for Details, New, Find, List.

Edit Details

The buttons at the side include the standard group (Menu, Details, New, Find, List) and the navigation box with CD-type controls and record numbers etc. Specific buttons for reports etc should be placed in the second group. When changed, this set of buttons should be grouped with the Navigation Box (see below) and copy-pasted into all other details screens using the Size window to ensure exact positioning.

Having the buttons on the left allows room in the main part for 2 columns of fields. The standard screen is not wide enough for three columns at a readable point size and spreading 2 columns over the whole width wastes space.

There may be several details screens to show various groups of data. All details screens should have the same bar at the top, showing the record number and one or two other particularly important fields. This helps prevent the user from losing track of what record they are looking at, and emphasises that they are looking at different fields within the same record.

View Details

The View Details layout is created by copying and pasting the form section of the Edit Details layout into the View Details layout. Use the Size window to place it in exactly the same position. This should then be sent to the back of the layout, behind the large rectangle which is defined as a button to prevent access. Create a New Tab Order with no tabs to prevent users from tabbing behind the rectangle.

Find

The Find screen is mostly a duplication of the Details screen with some differences.

The Find layout is created by copying and pasting the form section of the Edit Details layout into the Find layout. Use the Size window to place it in exactly the same position. Remove buttons, such as the circle button in portals, which do not belong on a find screen.

  • The New button creates a new Find request instead of a new record.
  • The navigation buttons do not perform any processing (see below).

When changes are made to the details layout, the fields should be re-copied to the Find layout. The Find button uses a trapped find script which does not allow the user to ever be left on the Find layout in Browse mode. The Find layout may be in a different colour to emphasise that the user is in Find mode.

Once the find is performed the script will go to the Details or List layouts depending on whether more than 1 record was found.

List

All files should have a list layout. Clicking on the button on the left takes the user to the Details for that record.

The [Sort] button allows users to sort by any order. The Column headings could also be defined as Sort buttons but this is difficult to maintain.

All Fields

This layout is for general system maintenance and should be used by scripts whenever it is absolutely necessary to use cut/paste instead of set field().

This layout can be updated by creating a new layout, Copying all fields, deleting the layout, going to All Fields, Deleting All Items, and Pasting the new fields.

The Navigation Box

This set of fields and buttons allows the user to move through the records without having to open the status panel. When in browse mode, the buttons also ensure that the SUB.Process script is called on each record. The panel also includes a [Del] button.

Scripting

SUB.Process

It is sometimes necessary to ensure that data is validated or updated using scripts whenever it is changed. Ensuring that all buttons call this empty script allows the developer to insert such processing in future without having to modify all the button scripts. Therefore, all new Button scripts should be created by duplicating an existing Button script.
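The idea resembles a hook function in conventional code. The Python sketch below is only an analogy: the function names and the list of records are hypothetical, and it says nothing about how FileMaker runs scripts internally. It shows why routing every button through one empty routine pays off once processing is needed.

```python
processed = []  # hypothetical log of records seen by the hook


def sub_process(record):
    """Stand-in for the empty SUB.Process script: does nothing yet."""


def button_next(records, index):
    """Every button calls the hook before doing its own work."""
    sub_process(records[index])
    return min(index + 1, len(records) - 1)


def button_previous(records, index):
    sub_process(records[index])
    return max(index - 1, 0)


# Later the developer fills in the hook in one place only.
# The buttons look the name up when called, so none of them changes.
def sub_process(record):
    processed.append(record)


records = ["rec1", "rec2", "rec3"]
position = button_next(records, 0)             # processes rec1, moves to rec2
position = button_previous(records, position)  # processes rec2, moves to rec1
```

Only `sub_process` was edited, yet both buttons now perform the new processing.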

N.B. This method will allow the developer to create indexed versions of un-indexable fields for fast finds and extra relationships.

Scripts Menu

The scripts menu should be used as a menu to move between areas in the entire system. This also discourages use of the Window menu.

Special Fields

Sys.Found, Sys.Records

Used by the Navigation Box.

Sys.Creation Date, Sys.Creation Time, Sys.Modification Date, etc

Additional information to help the developer maintain the database. Remember, these fields are usually not auto-entered in imported records.

G.Help.Code, G.Help.Applescript

Can be used for context-sensitive help. Specific Help buttons set a Help Code in G.Help.Code and a SUB script then displays the relevant text from the ‘System Help’ file. Because the FileMaker message box is of limited size, the G.Help.Applescript field can be used to generate an Applescript dialog box. The field already contains the necessary Applescript but needs to be altered to refer to your help file via a relationship.

Sys.StatusCurrentError

Used by scripts to trap errors. See the ‘BUTTON.Find’ script for example.

Sys.File Name

See ‘Global Fields’ below.

LINK

This calculation field is always 1. It can be used by a relationship to get at values in the System Constants file.

Global Fields

‘Sys.File Name’ is used on layout headings and in the label for the Primary Key. It should be entered as a singular rather than plural. The ‘s’ has been added to the layout where necessary. Calculated plural and singular fields would not show up in Find mode. Because this field is wiped when a clone is saved, the script ‘SUB.Set Sys.File Name’ is called from the Open Script. This resets the filename if it is empty. Therefore a new filename should be entered in All Fields and in this script.

Global Fields must not be used for button names because these fields will be wiped when a clone of the file is saved.

Portals

The file contains an example of a portal. This same arrangement should be used for all Portals. The portal has a circle button at the left which takes the user to the details for that item. If the portal is being used to add lines then the portal should have a Del button at the right.

Symbolic Logic

I put this on my website a long time ago, maybe around 1996, as an HTML page. This is it moved to my blog.

The structure of this explanation is lifted from ‘An Introduction to Symbolic Logic’ by Susanne K. Langer, without her permission.

1.

Content:

The things or material in a system.

Form:

The way in which the contents are related in a system.

Abstraction:

Separating Form from Content, sometimes by discovering analogies.

Interpretation:

Finding possible Content for Forms.

2.

Degree:

Number of Elements used by a Relation. e.g. Dyadic: ‘is north of’, Triadic: ‘is between’.

Proposition:

Asserts that the Elements are related by the Relation. e.g. ‘Edinburgh’ nt ‘Swindon’.

Symbols:

=int means ‘equals by interpretation’. e.g. ‘nt₂’ =int ‘is north of’.
~ means ‘is not true’. e.g. ~(‘Swindon’ nt ‘Edinburgh’).

3.

Context:

Consists of elements and relations.

e.g. K(‘Brighton’, ‘Swindon’, ‘Edinburgh’) nt₂, where K =int ‘cities’ and nt₂ =int ‘is north of’

Universe Of Discourse:

All Elements in the context. e.g. K(a,b,c,…) K =int ‘cities’

Constituent Relations:

The relations used in the context. e.g. nt₂ =int ‘is north of’

Elementary Propositions:

The statements that may be made by relating Elements.

e.g. ‘Edinburgh’ nt ‘Swindon’.

Truth Value:

Whether the Elementary Proposition is true or false. e.g. ‘Swindon’ nt ‘Edinburgh’ is false.

Symbols:

⋅ means Conjunction (‘and’)
∨ means Disjunction (‘or’)
⊃ means Implication (‘implies that’)

Logical Relations:

When the truth of one Elementary Proposition is dependent upon the truth of others they are Logically Related.

e.g. (‘Edinburgh’ nt ‘Swindon’) ⋅ (‘Swindon’ nt ‘Brighton’) ⊃ (‘Edinburgh’ nt ‘Brighton’)
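A rough Python sketch may make the example concrete. The south-to-north list and the helper names `nt` and `implies` are assumptions made for illustration; they are not part of Langer's notation.

```python
# Hypothetical context: three cities listed from south to north.
SOUTH_TO_NORTH = ["Brighton", "Swindon", "Edinburgh"]


def nt(a: str, b: str) -> bool:
    """Elementary proposition 'a nt b': a is north of b."""
    return SOUTH_TO_NORTH.index(a) > SOUTH_TO_NORTH.index(b)


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


# Truth values of two elementary propositions.
edinburgh_nt_swindon = nt("Edinburgh", "Swindon")
swindon_nt_edinburgh = nt("Swindon", "Edinburgh")

# The logical relation: (E nt S) and (S nt B) implies (E nt B).
relation_holds = implies(
    nt("Edinburgh", "Swindon") and nt("Swindon", "Brighton"),
    nt("Edinburgh", "Brighton"),
)
```

Here the first proposition is true, the second is false, and the logical relation holds.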

System of Elements:

Context with Elementary Propositions connected by Logical Relations

4.

Variables:

As in algebra. e.g. x means ‘Edinburgh’ or ‘Swindon’ or ‘Brighton’ etc.

Allows us to summarise Logical Relations of the same form.

e.g. (a nt b) ⋅ (b nt c) ⊃ (a nt c)

Quantifiers:

The Logical Relation may hold for All or Some of the variables.

Universal Quantifier:

(a) means ‘for all a’. e.g. (a) : (a nt b) ⊃ ~(b nt a)

Particular Quantifier:

(∃a) means ‘for at least one a’. e.g. (∃a): (a nt ‘Swindon’)
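Over a finite universe, the two quantifiers behave much like Python's `all` and `any`. The sketch below assumes a three-city universe listed from south to north, and for simplicity it quantifies over both a and b in the universal example; the names are illustrative only.

```python
from itertools import product

K = ["Brighton", "Swindon", "Edinburgh"]  # assumed universe, south to north


def nt(a, b):
    """'a nt b': a is north of b, judged by position in K."""
    return K.index(a) > K.index(b)


def implies(p, q):
    """Material implication."""
    return (not p) or q


# (a, b) : (a nt b) implies not (b nt a)
universal = all(
    implies(nt(a, b), not nt(b, a)) for a, b in product(K, repeat=2)
)

# (exists a) : (a nt 'Swindon')
particular = any(nt(a, "Swindon") for a in K)

# (a) : (a nt 'Swindon') fails, since Brighton is not north of Swindon.
everything_north_of_swindon = all(nt(a, "Swindon") for a in K)
```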

Propositional Form:

Elementary Proposition or Logical Relation using Variables, whose Truth Value would depend upon the actual Elements substituted. e.g. ‘a nt b’.

General Proposition:

Propositional Form with Quantifiers.

5.

Class:

∈ means ‘is a member of’. e.g. ‘Murray’ ∈ B, where B =int ‘Class of Humans’.

General Propositions (see above) concern Classes of Elements.

Defining Form:

Defines the Class in terms of Propositional Forms. The Class contains all elements for which the Propositional Form is True.
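A Python set comprehension is a fair analogy for a Defining Form: the condition plays the part of the Propositional Form, and the resulting set is the Class. The universe and the helper `nt` below are assumptions for illustration.

```python
K = ["Brighton", "Swindon", "Edinburgh"]  # assumed universe, south to north


def nt(a, b):
    """'a nt b': a is north of b, judged by position in K."""
    return K.index(a) > K.index(b)


# Defining form: 'x nt Brighton'. The class holds every x that makes it true.
north_of_brighton = {x for x in K if nt(x, "Brighton")}

# Membership: 'Swindon' is a member, 'Brighton' is not.
swindon_is_member = "Swindon" in north_of_brighton
brighton_is_member = "Brighton" in north_of_brighton
```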

Intension:

Meaning of a concept. e.g. the class ‘town’.

Extension:

The elements to which the concept applies. e.g. the class of ‘towns’.

N.B. Classes with unrelated Intensions may share some Elements in Extension.

e.g. ‘Towns north of Swindon’ and ‘Towns with Universities’.

Inclusion:

(x): (x ∈ A) ⊃ (x ∈ B) means Class A is included in Class B, by stating that any Element in A is therefore in B.

Unit Class (I):

Has one member, meaning that if two Elements are both in A then they must both be the same Element.

(∃x) (y): (x ∈ A) ⋅ [ (y ∈ A) ⊃ (x=y) ]

The Null Class (o):

Has no Elements. There is a single Null Class, because two null classes could not be distinguished.

The Universe Class:

Contains all Elements. There is a single Universe Class.

Mutual Inclusion:

Classes have same Elements so each Class is included in the other.

(x): [ (x ∈ A) ⊃ (x ∈ B) ] ⋅ [ (x ∈ B) ⊃ (x ∈ A) ]

Class symbols:

< means Inclusion. e.g. A < B means (x): (x ∈ A) ⊃ (x ∈ B)
X means Conjunction (and). e.g. A X B means the Elements which are in A and in B. X is often omitted e.g. AB
+ means Disjunction (or). e.g. A + B means the Elements which are in A or in B.
- means Complement. e.g. -A means the Elements not in A.
= means Mutual Inclusion. e.g. A = B means (A < B) ⋅ (B < A)

N.B. A<A, A<I, 0<A
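The class symbols map closely onto Python's set operators. The sketch below assumes a small made-up universe and two arbitrary classes A and B; the complement is taken relative to that universe.

```python
# Hypothetical universe and classes, chosen only for illustration.
UNIVERSE = frozenset({"Brighton", "Swindon", "Edinburgh", "Oxford"})
NULL = frozenset()

A = frozenset({"Edinburgh", "Oxford"})
B = frozenset({"Edinburgh", "Oxford", "Swindon"})


def complement(c):
    """-C: the Elements of the universe that are not in C."""
    return UNIVERSE - c


a_included_in_b = A <= B                  # A < B
a_and_b = A & B                           # A X B
a_or_b = A | B                            # A + B
not_a = complement(A)                     # -A
a_equals_b = (A <= B) and (B <= A)        # A = B

# The N.B. line: every class includes itself, is included in the
# universe class, and includes the null class.
nb_holds = (A <= A) and (A <= UNIVERSE) and (NULL <= A)
```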

Mutual Exclusion:

Classes have no Elements in common.

A X B = o

Dichotomy:

The fact that I = A + -A.

Exclusion:

A < -B means A is excluded from B.

7.

Predicate:

e.g. A, -A, A X B, A + B

Predicative Propositions:

Propositions about Predicates

System of Classes:

K(a,b,c…) <, similar to System of Elements but with Classes instead of Elements and < as the constituent relation.

Dots instead of brackets:

e.g. :. a : b . cd : e instead of ( a . ( b . ( c . d) ) . e)

You’ll get the hang of it.

8.

Calculus of Classes:

Describes the System of Classes for all Classes, just as the Calculus of Numbers describes the System of Numbers for all Numbers.

Shows how to deduce some Propositions from others.

Postulates:

Basic Propositions of the system. e.g. (a, b) . a + b = b + a.

There are ten Postulates of the Calculus of Classes, analogous to the uses of Venn Diagrams.

Axioms:

Self-evident Postulates that are assumed because they cannot be deduced.

9.

Boolean Algebra of Classes

Generalised Calculus of Classes, just as Algebra of Numbers is generalised Calculus of Numbers.

Laws of Duality

Conjunction can be defined in terms of Disjunction and vice versa.

Primitive Propositions

The propositions used to prove theorems in Boolean Algebra. They may use either Conjunction or Disjunction. They show the following:

Operational Assumptions:

Existence of complements, sums, products.

Existential Assumptions:

Existence of Universe Class, Null Class, more than one Class.

Laws of Combination:

Tautology e.g. ‘a + a = a’
Commutation e.g. ‘a + b = b + a’
Association e.g. ‘(a + b) + c = a + (b + c)’
Distribution e.g. ‘a + (b X c) = (a + b) X (a + c)’
Absorption e.g. ‘a + ab = a’

Laws of the Unique Elements:

Universe Class e.g. ‘a + 1 = 1’
Null Class e.g. ‘a + 0 = a’

Laws of Negation:

Complementation e.g. ‘a + -a = 1’
Contraposition e.g. ‘a = -b . ⊃ . b = -a’
Double Negation e.g. ‘a = -(-a)’
Expansion e.g. ‘ab + a-b = a’
Duality e.g. ‘-(a + b) = -a X -b’
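These laws can be spot-checked by brute force over every class of a small universe. The sketch below is an illustration rather than a proof: it assumes a three-element universe, models classes as Python frozensets, and checks the null-class law in the form a + 0 = a.

```python
from itertools import combinations, product

UNIVERSE = frozenset({1, 2, 3})  # assumed universe class (1)
NULL = frozenset()               # null class (0)


def all_classes(universe):
    """Every class (subset) that can be formed from the universe."""
    items = sorted(universe)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def neg(a):
    """-a: complement relative to the universe."""
    return UNIVERSE - a


def implies(p, q):
    return (not p) or q


CLASSES = all_classes(UNIVERSE)
PAIRS = list(product(CLASSES, repeat=2))
TRIPLES = list(product(CLASSES, repeat=3))

# '|' stands for +, '&' stands for X.
laws = {
    "tautology": all(a | a == a for a in CLASSES),
    "commutation": all(a | b == b | a for a, b in PAIRS),
    "association": all((a | b) | c == a | (b | c) for a, b, c in TRIPLES),
    "distribution": all(a | (b & c) == (a | b) & (a | c) for a, b, c in TRIPLES),
    "absorption": all(a | (a & b) == a for a, b in PAIRS),
    "universe class": all(a | UNIVERSE == UNIVERSE for a in CLASSES),
    "null class": all(a | NULL == a for a in CLASSES),
    "complementation": all(a | neg(a) == UNIVERSE for a in CLASSES),
    "contraposition": all(implies(a == neg(b), b == neg(a)) for a, b in PAIRS),
    "double negation": all(neg(neg(a)) == a for a in CLASSES),
    "expansion": all((a & b) | (a & neg(b)) == a for a, b in PAIRS),
    "duality": all(neg(a | b) == neg(a) & neg(b) for a, b in PAIRS),
}
```

Every entry in `laws` should come out true; a false entry would mean a law had been transcribed incorrectly.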

10.

System in Abstracto.

K R, where K is the universe of something and R is some way of relating these things.

Properties of Relations:

Reflexiveness e.g. ‘(a). a R a’
Symmetry e.g. ‘(a, b). a R b . ⊃ . b R a’
Transitivity e.g. ‘(a, b, c): a R b . b R c . ⊃ . a R c’
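For a finite K, a relation R can be modelled as a set of ordered pairs and these properties tested directly. The universe and the two sample relations below are assumptions chosen for illustration.

```python
from itertools import product

K = {1, 2, 3}  # assumed universe


def is_reflexive(universe, relation):
    """(a). a R a"""
    return all((a, a) in relation for a in universe)


def is_symmetric(relation):
    """(a, b). a R b implies b R a"""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """(a, b, c): a R b and b R c implies a R c"""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


less_than = {(a, b) for a, b in product(K, repeat=2) if a < b}
equals = {(a, a) for a in K}
```

With these definitions, `less_than` is transitive but neither reflexive nor symmetric, while `equals` has all three properties.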

11.

Propositional Calculus.

Uses a Universe of Propositions which are either True (1) or False (0).

p means ‘p is true’ or ‘p=1’, leading to ‘p=(p=1)’.

12.

Calculus of Elementary Propositions.

Used in Principia Mathematica by Whitehead & Russell. Improves on the flawed notation above.

⊢ means ‘it is asserted that’. e.g. ‘⊢: p ∨ q . ⊃ . q ∨ p’

13.

Function and Argument:

A Proposition consists of a Function and Arguments.

e.g. ϕx instead of p, where ϕ is the function and x is the argument.

We may quantify the argument instead of the whole proposition to show that functions which are not identical are formally equivalent, allowing us to express ‘(x): mortal(x) = will_die(x)’.
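The distinction between identical and formally equivalent functions can be sketched in Python. The two predicates and the small universe below are hypothetical stand-ins: they are defined differently, yet agree for every argument in the universe.

```python
K = range(10)  # assumed universe of arguments


def is_even(x):
    return x % 2 == 0


def has_even_square(x):
    return (x * x) % 2 == 0


# The functions are not identical...
identical = is_even is has_even_square

# ...but (x): is_even(x) = has_even_square(x) holds over K.
formally_equivalent = all(is_even(x) == has_even_square(x) for x in K)
```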

Logistics:

Seeks to create a logical foundation for mathematics.

Recommended Reading:

An Introduction to Symbolic Logic, Susanne K. Langer

Gödel, Escher, Bach: An Eternal Golden Braid, Douglas R. Hofstadter

Murray’s FileMaker Advice

I put this on my website a long time ago, maybe around 1995, as an HTML page. This is it moved to my blog.

Introduction

FileMaker is a relatively low-powered database program whose primary advantage is its ease of use. Compared to more ‘standard’ client-server database systems, solutions can be created very quickly. And because FileMaker focuses on a higher level of abstraction, these solutions have less unnecessary complexity and obscurity. These are serious advantages, because almost every system will need to be seriously rethought (‘refactored’) at some point in its development.

However, it is not, as claimed, fully relational, and it does not allow code reuse. These disadvantages can lead eventually to overstretched and disorganised systems. If you use FileMaker you should recognise these limitations.

The FileMaker Trap…

People get off the ground quickly with FileMaker. But often with uninformed confidence they add complication, redundancy and duplication until they are no longer able to predict the consequences of any more changes. The company’s administration may become led by the eccentricities of their system. Employees may be so confused by the bizarre sequence of actions which they perform that they no longer have any insight into their role in the business. This is not productive.

Developers do not just need technical skills, they must understand how to deal with complication itself. If you do not recognise and manage complexity it will eventually overwhelm you.

…And How To Avoid It

By using a little discipline from the start you can postpone the point at which the system becomes too complicated to develop further. If you are disciplined enough then you can postpone it indefinitely.

Self-Documentation

It is impossible to completely document a system which is continually being developed. Denying this fact can lead to confusion and cover-ups. The only way to ensure that a system’s design and operation can be understood is to:

  • Explain the general principles of your design, the principles behind its major components and any unusual techniques employed.
  • Use methods such as those below to make the interdependencies in your system obvious.

Scripts

Use prefixes to indicate how the script is used. For instance:

BUTTON.Invoice Details
SUB.Log Print Date

Rearrange the script list so the BUTTONs are grouped together appropriately. Use empty scripts to divide scripts into groups.

Fields

Indicate the source of a lookup or reference by prefixing the field name with the source. For instance:

Customers..Name
Customers..Address

If a field is calculated from another field, indicate this in the field name. For instance:

Order Date
Order Date.Year

Strive to identify distinct groupings of fields and identify them by giving all the fields a prefix. For instance:

Car.Model
Car.Registration

These techniques will group the fields together when using the standard alphabetical view, making the database structure far more obvious.
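The grouping effect is easy to see with a plain alphabetical sort. The Python sketch below uses the field names from the examples above plus a hypothetical unprefixed field, `Total`; it assumes a simple character-by-character sort, which may differ in detail from FileMaker's own ordering.

```python
from itertools import groupby

# Field names in the order they might have been created.
field_names = [
    "Order Date.Year",
    "Car.Registration",
    "Customers..Name",
    "Total",              # hypothetical field with no prefix
    "Order Date",
    "Car.Model",
    "Customers..Address",
]

# The alphabetical view brings fields with a shared prefix together.
alphabetical = sorted(field_names)

# Group by everything before the first dot to make the structure explicit.
groups = {
    prefix: list(names)
    for prefix, names in groupby(alphabetical, key=lambda n: n.split(".")[0])
}

for prefix, names in groups.items():
    print(prefix, "->", names)
```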

Remove Garbage

It is essential to remove any unused part of a system. In the ongoing development of a system, parts will inevitably be discarded, and it is important to remove these. In general, they will not have any impact on performance, but their presence implies that they are still relevant to the whole. At least mark them as unused. Otherwise they are a smoke-screen to anybody trying to understand the system.

Unused Fields

Do not delete unused fields because that would affect any scripted import orders. Rename them with a z. prefix so that they show up at the bottom of the list and turn off their indexing. When adding fields in future you may simply rename these fields and redefine them.

N.B. Use the Overview in the Password sub-menu to see which layouts a field is on.

Unused Scripts

FileMaker will not warn you when deleting scripts which are used by other scripts. Using the above script name prefixes will help in determining a script’s dependencies. Instead of immediately deleting a script, mark it with a prefix of UNUSED. and insert a Show Message script at the start. After some time, if this script has not displayed its presence, you may delete it.

White Space in Calculations

Use spaces and returns to make your calculations readable. Show nested IFs like so:

IF(test1;
 resultA;
 IF(test2;
  resultB1;
  resultB2
 )
)

The results should be indented by 1 more space than the IF and the ). Similar methods can be employed for Choose() and Case() statements. This should be second-nature to C or Java programmers.