7.5/10. Finished a few weeks ago, this is another (rather earlier) example of Hastings’ skill in writing penetrating and engaging military history, as well as his willingness to be critical of existing ‘sacred cows’. Among other things Hastings:

  • Argues that the famous Mulberrys were probably a waste of time and resources.
  • Shows how the Air Forces’ extreme unhelpfulness (largely driven by their own ambitions and obsession with civilian bombing) was a serious handicap to the whole campaign.
  • Supplies a sharp corrective regarding Patton’s reputation, pointing out that up against reasonable German opposition Patton did little better than anyone else.
  • Shows clearly how it was Hitler, almost more than anyone else, who contributed to the disastrous collapse of German forces in August-October 1944 by his insistence that no retreat of any kind be considered.
  • Provides many examples of the poor quality of equipment, leadership, and men, especially among the American forces, and shows how these deficiencies hindered the Allied campaign. In particular, Allied tanks were almost never a match for their German counterparts, and on any occasion when Allied and German troops met on anything near an equal footing the Germans won.[^1] In addition he details several clear cases of simple cowardice or unwillingness to fight among the Allied troops, and of extremely poor leadership stretching from the lowest levels to the highest. This is not to criticize (who can say what they would do in such circumstances?), and in many cases it reflects the fact that while the Germans were a nation that had for many years been ‘obsessed’ with soldiering, the Allied troops were ‘civilians in uniform’. But it does supply a useful corrective to the rose-tinted visions supplied by films such as The Longest Day or the newsreel footage showing Allied soldiers racing past cheering French civilians.

Finally, and as an aside, while good, the book also displays the limitations of the traditional book format as a method for presenting this sort of material (i.e. military history with its strong connections between the temporal and spatial aspects of events). At least for me, the attempt to render particular troop movements, or the direction of battles, in prose never really succeeds, and one finds oneself constantly flicking back to the (rather limited) maps in an attempt to connect the descriptions of events, the failures and successes of particular thrusts, with their location, both geographically and within the overall direction of the campaign. Thus it seems to me that this kind of subject is exactly the sort of thing best suited to the approach proposed by the Microfacts / Weaving History project, currently in the early stages of its development at the Open Knowledge Foundation. Here one would be able to marry maps with descriptions, photos with actions, and time with space, to provide a much clearer insight into what was going on.

[^1]: From p. 84 ff. “The American Colonel Trevor Dupuy has conducted a detailed statistical study of German actions in the Second World War. Some of his explanations as to why Hitler’s armies performed so much more impressively than their enemies seem fanciful. But no critic has challenged his essential finding that on almost every battlefield of the war, including Normandy, the German soldier performed more impressively than his opponents:

On a man for man basis, the German ground soldier consistently inflicted casualties at about a 50% higher rate than they incurred from opposing British and American troops UNDER ALL CIRCUMSTANCES. [emphasis in original] This was true when they were attacking and when they were defending, when they had local numerical superiority and when, as was usually the case, they were outnumbered, when they had air superiority and when they did not, when they won and when they lost.

It is undoubtedly true that the Germans were much more efficient than the Americans in making use of available manpower. An American army corps staff contained 55 per cent more officers and 44 per cent fewer other ranks than its German equivalent. …

Events on the Normandy battlefield demonstrated that most British or American troops continued a given operation for as long as reasonable men could. Then - when they had fought for many hours, suffered many casualties, or were running low on fuel or ammunition - they disengaged. The story of German operations, however, is landmarked with repeated examples of what could be achieved by soldiers prepared to attempt more than reasonable men could.”

Consider a metal arm fixed by a pin. If it is hung vertically then the arm, no matter where it starts, will always end up in the same position. However, if you fix the arm (perfectly) horizontally it will stay forever in its initial position. The first case is ergodic: we converge independent of the starting point to some particular configuration; while the second is ‘path-dependent’ (or dependent on initial conditions): where you end up depends crucially on where you start. The question:

Is animal/technological/historical/linguistic evolution ergodic or path dependent?

More generally, how ergodic or path-dependent are the following processes?

  • (Natural) Evolution
  • Technological change
  • Human history
  • Communication systems such as natural languages
  • Other symbol systems (e.g. games or mathematics)
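The two arms can be caricatured numerically. This is only a toy sketch, assuming a made-up update rule rather than real physics: the hanging arm is an angle that loses a fixed fraction of its displacement each step, and the flat arm is an angle that nothing acts on. The function names and the damping value are my own illustrative choices.

```python
def hanging_arm(angle, steps=200, damping=0.9):
    """Toy model: each step the arm loses a fixed fraction of its displacement."""
    for _ in range(steps):
        angle *= damping
    return angle


def flat_arm(angle, steps=200):
    """Toy model: nothing pulls on the arm, so it stays where it was put."""
    return angle


starts = [-90.0, 10.0, 170.0]

# Ergodic case: every starting angle ends up (to rounding) at the same place.
hanging_ends = [round(hanging_arm(a), 6) for a in starts]

# Path-dependent case: the end point is just the starting point.
flat_ends = [flat_arm(a) for a in starts]
```

For the processes listed above the interesting question is where they sit between these two extremes, which a toy like this obviously cannot settle.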

Versioned Domain Models

March 22nd, 2007

For over two years I’ve been thinking about how to build a versioned domain model, similar to the way we have versioned filesystems (e.g. Subversion). Over the last few months whatever bits of free time I’ve had have gone into developing a prototype built on top of SQLObject, and I’ve now got a rough and ready (but fully functional) library:

http://project.knowledgeforge.net/ckan/svn/vdm/branches/sqlobj/

How it is used is best shown by the tests:

http://project.knowledgeforge.net/ckan/svn/vdm/branches/sqlobj/vdm/dm_test.py

Why be tied to SQLObject? Obviously being so directly tied to SQLObject is not such a great thing, but I intentionally chose to build on it because so many people will already be writing their domain models using SQLObject.
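To make the idea concrete, here is a minimal sketch of what "versioned" means for domain objects. It is not the vdm API and does not use SQLObject: the `Repository` and `Revision` names, the in-memory storage and the example package are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    message: str
    # object id -> full attribute dict, for the objects changed in this revision
    changes: dict = field(default_factory=dict)


class Repository:
    """Hypothetical store: every commit creates a new numbered revision."""

    def __init__(self):
        self.revisions = []

    def commit(self, message, changes):
        rev = Revision(len(self.revisions) + 1, message, dict(changes))
        self.revisions.append(rev)
        return rev.number

    def get(self, object_id, at=None):
        """Return the object's state as of revision `at` (default: latest)."""
        if at is None:
            at = len(self.revisions)
        # Walk backwards from the requested revision to the last change.
        for rev in reversed(self.revisions[:at]):
            if object_id in rev.changes:
                return rev.changes[object_id]
        raise KeyError(object_id)


repo = Repository()
r1 = repo.commit("add package", {"pkg-1": {"name": "geodata", "license": "unknown"}})
r2 = repo.commit("fix license", {"pkg-1": {"name": "geodata", "license": "PD"}})

latest = repo.get("pkg-1")          # state after the second commit
original = repo.get("pkg-1", at=r1)  # state as it was at the first commit
```

As with a versioned filesystem, nothing is overwritten: old states stay readable by revision number.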

The Robustness Principle

February 22nd, 2007

2.10. Robustness Principle

TCP implementations will follow a general principle of robustness: be conservative in what you do, be liberal in what you accept from others.

Source: RFC 793 (Transmission Control Protocol), section 2.10.
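The principle carries over to any interface, not just TCP. A small sketch, assuming a made-up boolean flag format (the function names and accepted spellings are hypothetical): the reader tolerates several spellings, the writer only ever produces one.

```python
def parse_flag(raw):
    """Liberal on input: tolerate case, whitespace and common synonyms."""
    value = str(raw).strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"unrecognised flag: {raw!r}")


def emit_flag(flag):
    """Conservative on output: always the same canonical spelling."""
    return "true" if flag else "false"


normalised = [emit_flag(parse_flag(v)) for v in ["  YES ", "0", "On", "false"]]
```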

Thinking about Annotation

January 17th, 2007

Annotation means the adding of comments/notes/etc to an underlying resource. For the present I’ll focus on the situation where the underlying resource is textual (as opposed to being an image, or a piece of film or some data). Various things to consider when implementing an annotation/comment system:

  1. Addressing and atomisation: Are annotations specific to particular parts of the resource? If so, how do we store this address (relatedly: how is the resource ‘atomised’ and how do we address these atoms, or ranges of atoms)? For example, do we address by word, by character, by paragraph or by section? Do we wish to store ranges rather than a single address? Do we wish to allow a given annotation to be associated with multiple ranges/atoms?

  2. Permissions: Are there restrictions on the creation (deletion/updating etc) of annotations?

  3. Will the underlying resource change, and if so are annotations intended to be robust to those changes?

Let’s concentrate on the first issue for the time being as it is the most immediately important. Furthermore, defining the ‘atoms’ of the resource sharply narrows the implementation options.
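One possible answer to the addressing questions, sketched under stated assumptions: atoms are characters, an address is a half-open character range, and one annotation may point at several ranges. The class names, field names and the `hamlet-3-1` identifier are hypothetical, not taken from any existing system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Range:
    """Half-open span [start, end) measured in atoms (here: characters)."""
    start: int
    end: int

    def __post_init__(self):
        if not 0 <= self.start < self.end:
            raise ValueError("range must satisfy 0 <= start < end")


@dataclass
class Annotation:
    resource_id: str
    body: str
    ranges: list = field(default_factory=list)  # one annotation, many ranges

    def quoted(self, text):
        """Return the pieces of `text` this annotation points at."""
        return [text[r.start:r.end] for r in self.ranges]


text = "To be, or not to be, that is the question"
note = Annotation("hamlet-3-1", "Repeated phrase", [Range(0, 5), Range(14, 19)])
pieces = note.quoted(text)
```

Choosing words or paragraphs as atoms instead would only change what `start` and `end` count.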

The Simple Case: Mod a Blog

If one is happy to have fairly large atoms (pages, or even sections of some piece of text) then implementing an annotation system can be reduced to grabbing your favourite CMS or blogging software and feeding the text in, in appropriate chunks. This is often satisfactory and is a simple, low tech solution that will pretty much work out of the box. A classic example of this approach is http://www.pepysdiary.com/ which works so well because the subject matter (Samuel Pepys’s diary) has a very obvious atomisation (namely the daily diary entries) perfectly suited to blog software (in this case Movable Type).

You can even start doing a bit of modding, for example to present recent annotations (http://www.pepysdiary.com/recent/) or to present the text plus annotations all in one piece. (Given that commentonpower seems to fall neatly into this category, with most commentable atoms of the right size for ‘blog’ entries, I wonder why they didn’t just implement it as a plugin for WordPress - perhaps it was such a simple app that it was easier to ‘roll their own’.)

Getting More Atomic

Once you want atoms below a size comfortable for individual HTML pages or blog entries, wish to allow people to comment on chunks too large for an individual page, or to comment on ranges, one starts to have problems with this approach. The main challenge at this point is to find some way to extract the addressing information from the client doing the annotation. Confining ourselves to the web, the challenge becomes how to structure the interface and the text so that one can determine range start and end points. This is a non-trivial matter. Possible options include:

  • Javascript: in theory the selection/range objects should help us out here; unfortunately cross-browser support is patchy (Firefox as usual is excellent and IE pretty bad). If one does not need to be so precise as to get ranges, Javascript could also be used to extract e.g. element ids.
  • Copy and paste of the quote to annotate, with some backend algorithm to determine the actual range. Nice and simple, but it is not clear that one can ‘invert’ (i.e. find a unique range from a given selection) unless the selection is large.
  • If addressing fairly large atoms (e.g. a paragraph or larger) one could just insert a unique piece of user interface equipment (e.g. a button or link) with each atom. Note however that this prevents support for ranges.
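The ‘inversion’ problem in the copy-and-paste option is easy to see in a few lines. A minimal sketch, assuming exact string matching on the backend (the function names are hypothetical): a pasted quote only resolves to a range if it occurs exactly once.

```python
def find_ranges(text, quote):
    """Return every (start, end) span where `quote` occurs in `text`."""
    spans = []
    start = text.find(quote)
    while start != -1:
        spans.append((start, start + len(quote)))
        start = text.find(quote, start + 1)
    return spans


def resolve(text, quote):
    """Return the unique span for `quote`, or None if absent or ambiguous."""
    spans = find_ranges(text, quote)
    return spans[0] if len(spans) == 1 else None


text = "To be, or not to be, that is the question"

short = resolve(text, "be")        # occurs twice, so it cannot be inverted
longer = resolve(text, "that is")  # occurs once, so it maps to one range
```

A real backend would also have to cope with whitespace and markup differences between what the browser copies and what is stored, which this sketch ignores.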

Separating Data and Presentation

Whatever one chooses to do, it does seem sensible to separate data and presentation clearly. This is particularly important when there is so much uncertainty over the user interface. In particular, it would be good to specify the annotation format clearly and implement a programmatic interface to it independent of the standard (human) user interface. That way it is easy to switch interfaces (or have multiple ones). Given that annotations are essentially just comments it would seem sensible to try to reuse an existing format such as Atom (or RSS) for the machine interface to the comment store. Marginalia already had such a format based on Atom. I’ve recently reimplemented a stripped down version of this format for the annotation store backend in Python, in preparation for adding annotation support to the openshakespeare web interface, see:

http://project.knowledgeforge.net/shakespeare/svn/annotater/trunk/

Of course as discussed above this isn’t quite as simple as it looks as your user interface can constrain what you can and can’t store (using a blog approach you can’t store ranges and from what I have read getting reliable character offsets is problematic). Nevertheless it seems the best place to start.
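For a flavour of what an Atom-based annotation record might look like, here is a rough sketch using only the standard library. It is not the Marginalia format nor the one in the annotater code: the `range` extension element, its namespace, and the example ids and URLs are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ANNO = "http://example.org/annotation"  # hypothetical extension namespace


def annotation_entry(entry_id, author, note, target_url, start, end, updated):
    """Build an Atom <entry> carrying one annotation and its character range."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = note[:40]
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    person = ET.SubElement(entry, f"{{{ATOM}}}author")
    ET.SubElement(person, f"{{{ATOM}}}name").text = author
    ET.SubElement(entry, f"{{{ATOM}}}content", {"type": "text"}).text = note
    # The annotated resource, plus the (assumed) range extension element.
    ET.SubElement(entry, f"{{{ATOM}}}link", {"rel": "related", "href": target_url})
    ET.SubElement(entry, f"{{{ANNO}}}range", {"start": str(start), "end": str(end)})
    return entry


entry = annotation_entry(
    "urn:example:annotation:1",       # hypothetical id
    "A. Reader",
    "Repeated phrase",
    "http://example.org/hamlet",      # hypothetical resource URL
    0,
    5,
    "2007-01-17T00:00:00Z",
)
xml_text = ET.tostring(entry, encoding="unicode")
```

Because the record is plain Atom plus one extra element, any interface (blog-style, Javascript ranges, copy and paste) can write to the same store.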

Technology and History

May 28th, 2006

Found in a review by Garry Wills of Taylor Branch’s At Canaan’s Edge: America in the King Years, 1965-1968 in the NYRB (2006-04-06, p. 20):

It is amazing how Branch can marshal so much material along so many tracks, moving it ahead stage by stage in coordination with King’s actions. Then I saw Branch in a three-hour television interview with C-SPAN and learned part of his secret. He showed the interviewer his computer with its expertly programmed chronological record of all the information he had acquired from so many sources - over 17,000 items arranged year by year, day by day. The book probably could not have been written - surely not in so relatively short a time - without the computer.

The Invention of Symbols

November 3rd, 2005

We believe that we invent symbols. The truth is that they invent us. Gene Wolfe, The Book of the New Sun.

  1. Uncodified knowledge cannot be transferred except by face-to-face interaction (apprenticeship etc)
  2. But knowledge codification is very time and space consuming (and much still remains implicit)
  3. As the amount of codified knowledge grows it becomes harder to find what you want

Hypothesis: Value of Information in Databank = Value of Information if it could be Accessed Perfectly x Ease of Finding Any Particular Item

Plausible to assume Ease of Finding Information = h(Amount) where h’ less than 0

  • Let Amount = n
  • In standard computer science, if we could sort the items in some manner (by which we could also search), search time is of order log(n), so h(n) = 1/log(n) (and sorting costs are of order n log(n), e.g. merge sort; bubble sort is of order n^2)
  • Suppose only option is brute comparison (and it is useful to find a negative i.e. that what you want isn’t in there). Then this suggests E(search time) = n/2 and h(n) = 2/n

Plausible to have diminishing returns for Value of Information if it could be Accessed Perfectly = f(Amount), so f” less than 0. Thus f grows at a less than linear rate (eventually …).

  • If h has the form suggested, i.e. 2/n, then eventually the Value of Information in the Databank is /decreasing/ in the amount of information in the databank
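A quick numerical check of the hypothesis, as a sketch only. It assumes a particular concave f (a square root, chosen just because it has diminishing returns) and takes ease of finding to be the reciprocal of search cost, as in the 2/n case; the function names are mine.

```python
import math


def value_if_perfect(n):
    """Assumed concave f: diminishing returns to more information."""
    return math.sqrt(n)


def ease_brute_force(n):
    """h(n) = 2/n: reciprocal of the expected n/2 comparisons."""
    return 2 / n


def ease_sorted(n):
    """h(n) = 1/log2(n): reciprocal of binary-search cost (needs n >= 2)."""
    return 1 / math.log2(n)


def databank_value(n, ease):
    """Value of databank = f(n) * h(n), per the hypothesis."""
    return value_if_perfect(n) * ease(n)


sizes = [10, 100, 1_000, 10_000]
brute = [databank_value(n, ease_brute_force) for n in sizes]
indexed = [databank_value(n, ease_sorted) for n in sizes]
```

Under these assumptions `brute` falls as the databank grows while `indexed` keeps rising, so whether more information is worth having depends on how well it can be searched.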

Example: explaining how to use a computer …

Info on Size of Databanks

  1. How Much Information? Varian and Lyman, http://www.press.umich.edu/jep/06-02/lyman.html
  2. Ithiel De Sola Pool. Communications Flows: A Census in the United States and Japan. Elsevier Science, New York, 1984

The Nature of Information

February 14th, 2005

Coining an aphorism: We are moving towards a world in which all information is software and all software is information.

Plan

  1. We process information linearly. This is a fundamental fact. (Aside: example of polyphonic music and the Glenn Gould radio program). Symbol processing in Homo sapiens is serial and cannot manage either parallel or non-linear presentation, particularly textual symbol processing. This is not only related to the methods by which humans obtain sensory input but derives from the very structure of high-level information processing in the brain. This is manifested very clearly in language.
  2. Thus even where information is presented non-linearly, or more commonly in parallel, we still create our own linear thread as we progress through it. A concrete example is given by the internet or by encyclopedias. Though both present a web of information rather than an explicit linear narrative, the human mind cannot branch multiply in any literal sense. Thus as I progress through a website or an encyclopedia, though I may branch, I then leave the original line of investigation - perhaps to return later.
  3. Given this fact that we can only read along one dimension at once, we see the great challenge of all analytical writing, namely to present in single-dimensional linear form that which is always multidimensional and non-linear.
  4. Thus we are presented with a dilemma. Much knowledge and information is multi-faceted, approachable from many different angles simultaneously, yet if it is to be understood and processed by humans it must be presented serially, that is to say linearly along a single path. Now I do not suggest that we can overcome these inherent limitations but I do suggest that we can approach knowledge storage and categorization in such a way as to impose the minimal limits on the possible methods of presentation.

The Metaphor

We can imagine the building blocks, the factlets, as pearls, little pearls of knowledge. We can then imagine the creation of an expository line, or narrative if we allow ourselves to abuse terminology, as the stringing of these pearls onto the thread - the thread of narrative - which when complete provides a ‘necklace’ of exposition (NB: though we should avoid seeing any cyclical structure in analogy with the circular necklace, as it is more usual for an exposition to resemble an interval with a beginning and end and a direction of progression).

Other Items

The multiple classification problem. Analogies and examples:

  1. no canonical basis vectors for a finite-dimensional vector space.
  2. The Borges story cited by Foucault on the Chinese emperor’s encyclopedia

The Art of Writing History

Most history writing, even of the analytical variety, consists of linear exposition. I often describe this as a narrative but this is dangerous, as narrative usually denotes a very specific form of linear exposition.

An Example

The example we shall examine is the Hundred Years War (this is, of course, a subject eminently suited to a narrative historiographical approach). The Hundred Years War was the century-long struggle between the English and French crowns for control of France and various of its subdomains. From the very beginning of historiographical interest in these events (e.g. Froissart) the approach taken has been a narrative one. The most recent work in this tradition is the multivolume work by Jonathan Sumption. He encounters a classic problem: how is one to shoe-horn this struggle into the linear strait-jacket of the printed page? For not only do we have the obvious approach given by time’s arrow (which is the backbone of traditional ‘narrative’ in history) but also the thematic structure given by the geographic dispersion of the conflict.

A simple method for visualizing these situations is given by reducing this problem to two dimensions with time on one axis and all other themata being put along the other axis:

(themata)   English throne | French throne | Charles the Bad, King of Navarre | Major Battles | ….
 Time
  ||
  ||
  \/
            ….             | ….            | ….                               | ….
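The grid also shows why there is more than one way to string the pearls. A small sketch, assuming factlets are stored with a year and a theme (the `Factlet` class and function names are hypothetical, and the five dated events are well-known facts added here for illustration, not taken from Sumption): the same store yields a chronological thread or a thematic one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factlet:
    year: int
    theme: str
    text: str


factlets = [
    Factlet(1346, "Major Battles", "Battle of Crécy"),
    Factlet(1340, "English throne", "Edward III assumes the title King of France"),
    Factlet(1356, "Major Battles", "Battle of Poitiers"),
    Factlet(1350, "French throne", "John II succeeds Philip VI"),
    Factlet(1349, "Charles the Bad", "Charles becomes King of Navarre"),
]


def chronological(items):
    """One thread: follow time's arrow, mixing themes."""
    return sorted(items, key=lambda f: f.year)


def thematic(items, theme_order):
    """Another thread: finish each theme (in time order) before the next."""
    rank = {theme: i for i, theme in enumerate(theme_order)}
    return sorted(items, key=lambda f: (rank[f.theme], f.year))


themes = ["English throne", "French throne", "Charles the Bad", "Major Battles"]
by_time = [f.year for f in chronological(factlets)]
by_theme = [f.year for f in thematic(factlets, themes)]
```

Reading down the grid gives `by_time`; reading it column by column gives `by_theme`. Neither ordering is stored with the factlets themselves, which is the point of keeping the pearls separate from the thread.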

Further work: detailed examination of chapters in vol. 1 of Jonathan Sumption’s History of the Hundred Years War