Saturday, September 10, 2016

What is great in man is that he is a bridge and not a goal


There is no dark side of the moon. Matter of fact, it's all dark. The only thing that makes it look light is the sun.

The typical conception of a resource like "Arxiv" or "Wikipedia" is summed up in the name: it is an archive or an encyclopedia that anyone can edit.  "Stack Exchange" is somewhat less self-descriptive -- but nevertheless, the emphasis on "exchange" is apt, since this site is not quite a gift economy in Eric S. Raymond's sense, but rather a place where questions are exchanged for answers, and both are exchanged for reputation in the form of points and badges.

Nevertheless, in broad brushstrokes all three resources have something much more essential as a common basis: they all grow in the course of use.  That said, they do not typically transform radically along the way.  Arxiv remains an archive; Wikipedia remains a wiki and encyclopedia; Stack Exchange is and always will be a Q&A site focused on questions with specific, usually technical, answers.

But what if we get "outside" of these systems, stop thinking about them as objects, and start thinking about them as (collections of composable) processes?  From this vantage point, a change in type is as likely as a change in magnitude.  For example, we can imagine a next-generation computer system that combines data from these various sources and that can serve as a dialogue partner.

Dubious?  Consider that from our new vantage point outside the system the people who use the resources are indistinguishable from computational agents that accomplish similar tasks.  Look up something here, ask about it there, write about something related somewhere else.

Social scientists have studied interaction with Wikipedia, but a more computational flavour of research asks how the contributed material is organised, and attempts to anticipate what the system would need to do in order to achieve some new processing task.  For example, how much would the system need to know in order to begin to ask and answer questions, or participate in a collaborative problem solving process?

We can get some clues about these speculative questions by considering the current state of such systems.  A resource like Wikipedia or Stack Exchange can be used as a "prism" or "fractionator" that will allow a given input text to be parsed.  Let's consider the mathematical domain, where there is the convenient epistemological gold standard of "proof."

We can use the structure of an encyclopedia-like resource to deduce that "circle" is a basic mathematical concept, whereas "holomorphic function" involves more information and requires more subject-specific background to understand and apply.  This is not to say that "circle" would be an easy concept for an AI system.  Rather, I'm suggesting that the process of solving a mathematical problem, participating in a dialogue, or proving a new theorem involves building up the domain of discourse in which a given term or query becomes meaningful.
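To make that slightly more concrete: one crude way to operationalise "requires more background" is to treat the encyclopedia's link structure as a prerequisite graph and measure how deep a term sits in it.  The sketch below is purely illustrative -- the PREREQS table is hand-made, not real harvested data -- but it shows the kind of measurement I have in mind.

```python
# Hypothetical prerequisite links harvested from an encyclopedia-like resource:
# an entry A in PREREQS[B] means "understanding B presupposes A".
PREREQS = {
    "set": [],
    "function": ["set"],
    "circle": ["set"],
    "continuous function": ["function"],
    "differentiable function": ["continuous function"],
    "complex number": ["function"],
    "holomorphic function": ["differentiable function", "complex number"],
}

def depth(term, prereqs=PREREQS):
    """Longest chain of prerequisites below `term` -- a crude proxy for
    how much background the term presupposes."""
    parents = prereqs.get(term, [])
    if not parents:
        return 0
    return 1 + max(depth(p, prereqs) for p in parents)

print(depth("circle"))                # 1
print(depth("holomorphic function"))  # 4
```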

In principle, everything is related to everything else.  In practice, some things are more related than others.  Finding points of overlap or connection should allow users to recover the bridging concepts that are useful for answering questions (whether routine or novel).

I've previously done some pilot study work using the idea of "fractionating" texts according to the depth of their constituent terms, and I also looked at timestamped data to try and determine the precipitating causes of a given action from a system.  But more recently, I've been thinking about how to approach similar issues from a simulation point of view.  Thus, e.g., generating texts, or programs, or critiques.
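Concretely, the fractionation step can be as simple as bucketing the recognised terms of a text by their estimated depth.  A minimal sketch, reusing the hypothetical depth function and prerequisite table from the sketch above (the substring matching is deliberately naive):

```python
from collections import defaultdict

def fractionate(text, depth_fn, lexicon):
    """Group the known terms occurring in `text` by estimated depth, so
    shallow vocabulary and specialist vocabulary end up in separate
    'fractions'."""
    fractions = defaultdict(set)
    lowered = text.lower()
    for term in lexicon:
        if term in lowered:
            fractions[depth_fn(term)].add(term)
    return dict(fractions)

sample = "Every holomorphic function on the unit circle is continuous."
print(fractionate(sample, depth, PREREQS.keys()))
# e.g. {1: {'circle', 'function'}, 4: {'holomorphic function'}}
```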

More specifically and technically: what I have in mind is developing generative testing approaches to programmatic interaction that will allow the computer to build out its own system in new directions with minimal guidance from the programmer.  The intuition here is that what we do depends on what we perceive needs to be done in a given situation.  This is relevant to thinking about how the computer could participate in dialogue.
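For readers who haven't met it, "generative testing" in the existing sense is what property-based testing libraries like Hypothesis already do: the programmer states a property and the machine generates the test cases.  What I'm speculating about is an extension of this idea, where the failing cases tell the system what it needs to build next.  The sketch below only shows the existing technique, with a toy property I made up:

```python
from hypothesis import given, strategies as st

# A toy 'system capability': normalising whitespace in a query string.
def normalise(query: str) -> str:
    return " ".join(query.split())

@given(st.text())
def test_normalise_is_idempotent(q):
    # The property: applying the operation twice changes nothing more.
    # Hypothesis generates the inputs; run under pytest or call directly.
    assert normalise(normalise(q)) == normalise(q)
```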

But just now these proposals may seem quite abstract and in that respect rather hopeless. Moving from big collections of words to models of word cooccurrence (for example) is natural enough.  Moving from there to text understanding and dialogue sounds quite a bit harder.

Let me therefore break the proposal down a bit further, along three quite specific directions.
  • [PEER] LEARNING - Organising learning pathways is quite similar to automatic programming, insofar as the pathways need to change and adapt depending on circumstances. If someone doesn't understand "holomorphic function," for example, they might want to review "differentiable function" (see the sketch after this list). More broadly, a peer learning experience can help surface and fill in gaps in understanding in an emergent and ad hoc manner.
  • [MATHEMATICAL] COLLABORATION - Links between different resources like Wikipedia, Stack Exchange, and Arxiv are perhaps even more interesting than the resources themselves, from an AI point of view.  Here we consider collaboration "in the large" and the mostly-technical challenge of integrating new knowledge into a large scale model, without disrupting contributors' workflows.
  • [COMPUTATIONAL] CREATIVITY - If we want mathematical agents to help us solve problems (or indeed, to solve problems that we are not able to solve directly) then they need to know what the problems mean, i.e., they need to be able to parse and flesh out our imperfectly expressed ideas and queries.
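Here is a minimal sketch of what the learning-pathway point could mean in code, continuing the hypothetical prerequisite table from the first sketch in this post: given what a learner already knows, a review path is just a walk over the prerequisites they haven't met yet.

```python
def review_path(term, known, prereqs=PREREQS):
    """Suggest prerequisites to review, deepest unknowns first.
    `known` is the set of terms the learner already understands."""
    missing = []
    def visit(t):
        for p in prereqs.get(t, []):
            if p not in known and p not in missing:
                visit(p)
                missing.append(p)
    visit(term)
    return missing

print(review_path("holomorphic function",
                  known={"set", "function", "circle"}))
# ['continuous function', 'differentiable function', 'complex number']
```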
This trifecta works together to provide a "way in."  Descending from the abstract to the concrete, we can imagine specific computational challenges like: given an answer to a question on Stack Exchange, match it to the question it addresses -- or given a question on Stack Exchange, predict the technical terms that will be found in a good answer.  These challenges are straightforward to pose, and we can expect some 5% or 10% success rate out of the box using "standard methods."
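As a gesture at what "standard methods" means here, a bag-of-words baseline for the answer-to-question matching task might look like the following (scikit-learn assumed; the example data is a stand-in, not a real Stack Exchange dump):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in data: in practice these would come from a Stack Exchange dump.
questions = [
    "How do I show that a holomorphic function is analytic?",
    "What is the area of a circle of radius r?",
]
answer = "Use Cauchy's integral formula to expand the function as a power series."

vec = TfidfVectorizer(stop_words="english")
Q = vec.fit_transform(questions)       # TF-IDF vectors for the questions
a = vec.transform([answer])            # project the answer into the same space

scores = cosine_similarity(a, Q)[0]    # similarity of the answer to each question
best = scores.argmax()
print(questions[best], scores[best])
```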

The gold standard of proof is likely to come in handy if we ever need to break down large problems into smaller ones.  Thus, instead of trying to deduce A's from Q's or vice versa, we could try to deduce proof steps in a paper from the foregoing discussion, lemmas, and citations.

If the system gets stuck, which it inevitably will do often, we then ask: what does it need to LEARN?  What COLLABORATIVE processes might it participate in to clarify some specific uncertainty?  What new methods might it need to CREATE to answer the foregoing questions on its own?

Friday, September 9, 2016

Creative land: patuki

Creative Land: Place and Procreation on the Rai Coast of Papua New Guinea, By James Leach, page 76


I remember recently having the eerie feeling, when I sat down to drink a coffee on my doorstep, that we are not so far away from myths and from history after all.  Wouldn't it be weird if "phatic" had yet another etymology -- namely, that it was appropriated from the Papua New Guineans?

Monday, September 5, 2016

Objects in channels

[...] the human mind is dependent for its objects to a great degree upon channels or means that are not under its own control. It is thus dependent on the thousand channels and means by which objects are introduced to it. But we need here only instance that wonderful assemblage in the human body. These organs which we term the senses, one or the other of them, convey to the mind its first object and afterwards all the new objects about which it acts. (Day 1876: 15)
Cognitive channels. Kroeber and some other anthropologists wrote about the cultural function of phatic communion, noting that it includes relating to another person through your mutual relations to cultural events, signs, and texts. There are some quotes about how people used to read each other the news, which made me think about Facebook feeds, and how social media is a sociocultural infrastructure of sorts, one that facilitates digital news sharing, for example, instead of face-to-face interaction. I wonder how different our "faceless" sociocultural entanglements are from those of pre-networked people. Some lady in the late 1920s was equally excited about romance novels, and about how the relationship patterns of our imaginations could be contrasted with those of real people. I'm not sure if anyone has bothered to summarize that kind of literature, though I know there must be quite a lot in journals like College English or in the interpersonal-relationship journals of the late 1990s. Eh, I'd like to acquaint myself more with late 19th- and turn-of-the-century thought, especially obscure philosophy from archive.org - there's a lot of good stuff hidden in black and yellow.

INLG - CC in NLG Workshop [liveblog]

I'm attending a morning workshop in Edinburgh on Computational Creativity in Natural Language Generation.  On the train ride up I was reading the book Renku Reckoner and spotted some interesting things in it; I was particularly struck by some passages about "person and place" in the renga/renku form.

One of the things that was really interesting here is the notion of different kinds of "linking" which I take to be a kind of contact function between verses in the poem.  Because of the specific formal constraints on renku, we get something with a Simondonian flavor:
At its simplest Shofu renku can be seen as a strand of poetry which opens continuously outward. (p. 97)
The constraint is that with a sequence of verses A, B, C, D, verse C should "link" to B, but "shift" away from A.  This link-shift frame moves one step to the right when we come to compose D. So we get a kind of continually-building sequence of associations.  It brings to mind the so-called Chinese perspective (cf., e.g., "Chinese Perspective as a Rational System: Relationship to Panofsky’s Symbolic Form")
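Stated as a constraint over feature annotations, link-shift is easy to check mechanically.  A minimal sketch with made-up annotations (real renku linking is of course far subtler than shared keywords):

```python
def link_shift_ok(verses):
    """verses: list of sets of features. Each verse must share a feature
    with its predecessor (link) but none with the verse before that (shift)."""
    for i, v in enumerate(verses):
        if i >= 1 and not (v & verses[i - 1]):
            return False  # fails to link to the previous verse
        if i >= 2 and (v & verses[i - 2]):
            return False  # fails to shift away from the verse before that
    return True

A = {"autumn", "moon"}
B = {"moon", "traveller"}
C = {"traveller", "inn"}
D = {"inn", "plum blossom"}
print(link_shift_ok([A, B, C, D]))  # True
```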


The content of individual verses can be analysed in terms of various features -- i.e. the features that are assumed to shift as we move forward in the poem.  One of these is place.
Place encompasses everything from geography to the site of a specific activity -- virtually any stanza that does not show a person or a group of people.  A caterpillar on a leaf, a basket of fruit in the market, a bird in the sky, and so on.  Whether its locale is mentioned or not, any object may be construed as implying its setting, thus qualifying it [...] as place. -- Shokan Tadashi Kondo and William J. Higginson, "Link and Shift: A Practical Guide to Renku Composition"

More classically, Tachibana Hokushi pointed out several different levels or types of 'person', which remind me of Ruesch's divisions:

  • ji - self, first person experience
  • ta - other, the experience of another person
  • ji-ta-han - self and other, the experience of oneself with another or others
  • ashirai - public or mixed, the experience of a group of people, even if indistinctly drawn
  • ba - place, an event or scene without human involvement, though artefacts may be included

The links themselves can be divided into different layers: 'word' or 'object' links are the most basic.  'Content' and 'meaning' links are more subtle.  The most subtle is the 'scent link' or nioizuke, which "evokes a much more tenuous set of associations which are nowhere specified in the text itself."  This has a specific, interesting effect: "The reader is obligated to engage with the poem as an active interpreter."

That -- interestingly enough -- is what McLuhan is talking about vis à vis "Hot" media [I think...].

Pierre-Luc Vaudry and Guy Lapalme
Assembling Narratives with Associative Threads

This paper sort of went over my head.  They referred to Zwaan 1995 on different kinds of links between texts, like causality.  That sounded interesting.

Kaori Kumagai, Ichiro Kobayashi, Daichi Mochihashi, Hideki Asoh, Tomoaki Nakamura and Takayuki Nagai
Human-like Natural Language Generation Using Monte Carlo Tree Search

This paper is about using a technique similar to the technique used in computer board game playing to generate sentences.  A challenge to this approach is that board games have win-lose propositions, whereas evaluating generated sentences isn't so obvious.  The approach the authors use here seems to favour 'typicality' of generated sentences and co-occurring words.  What seems to be missing is a context-specific evaluation (e.g. typicality when confronted with such and such a situation, which is closer to the game-playing model).  Questions point out difficulty with sentences like "the bread eats the dog" or similar ambiguity between "love ?? man ?? woman".
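To give a rough feel for the family of techniques (this is my own toy stand-in, not the authors' model): a "flat" Monte Carlo version picks the next word by averaging the typicality of random completions.  The bigram table below is invented; in the paper the scores come from a trained language model, and the full method builds a search tree rather than sampling flatly.

```python
import random

# Invented bigram 'typicality' scores, standing in for a language model.
BIGRAM = {
    ("the", "dog"): 0.9, ("dog", "eats"): 0.8, ("eats", "the"): 0.7,
    ("the", "bread"): 0.9, ("bread", "eats"): 0.05, ("eats", "bread"): 0.6,
    ("dog", "bread"): 0.01,
}
VOCAB = ["the", "dog", "bread", "eats"]

def score(sentence):
    """Mean bigram typicality of a word sequence."""
    pairs = list(zip(sentence, sentence[1:]))
    return sum(BIGRAM.get(p, 0.0) for p in pairs) / max(len(pairs), 1)

def pick_next(prefix, n_rollouts=200, length=4):
    """Flat Monte Carlo: for each candidate next word, average the score
    of random completions, and keep the best candidate."""
    best, best_val = None, -1.0
    for w in VOCAB:
        total = 0.0
        for _ in range(n_rollouts):
            rollout = prefix + [w] + random.choices(VOCAB, k=length - len(prefix) - 1)
            total += score(rollout)
        if total / n_rollouts > best_val:
            best, best_val = w, total / n_rollouts
    return best

print(pick_next(["the"]))  # likely 'dog': "the dog ..." completions score higher
```

Notice that this objective rewards typicality alone, which is exactly the limitation raised above: "the bread eats the dog" is penalised, but nothing ties the choice to a specific situation.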

Pablo Gervás
Empirical Determination of Basic Heuristics for Narrative Content Planning

Distinction between "what is told" and "how it is told" is a typical Saussurean split.  "Focalization" (a term from Genette) tells the story from the point of view of a given character -- here cf. Sloterdijk and the notion of 'phatic architecture'.  

Something also to with object persistence -- characters don't stop existing when they aren't "in the frame."  (But what about something like Amores Perros, in which there are several entangled stories? -- so-called Hyperlink cinema.)

As described by Edward Soja and Costis Hadjimichalis, spatial analysis examines the "'horizontal experience' of human life, the spatial dimension of individual behavior and social relations, as opposed to the 'vertical experience' of history, tradition, and biography." -WP
Joseph Corneli and Daniel Winterstein
X575: writing rengas with web services 


(Full set of slides {here}.)

Satoshi Sato
A Challenge to the Third Hoshi Shinichi Award

Hoshi Shinichi was one of the big 3 Japanese short story writers.  His favorite characters were non-human (e.g. robots, aliens).  Plots are hand-programmed as part of the grammar.  Grammar generation is hard.

Louisa Pragst, Juliana Miehle, Stefan Ultes and Wolfgang Minker
Automatic Modification of Communication Style In Dialogue Management 

'Verbosity' and 'directness' can be varied.  ("Many people take an aspirin when they have a headache.")  Verbosity is interesting as an example of and/and/and discourse.  "How do we determine if this is relevant for the user?"  One approach is to see whether the various anded terms always appear together in a corpus.  Directness might be set up as a specific response to a user's request for information.  Check e.g. if the user understood the provided information.

We can't really adapt the situation without understanding the high-level "goals" or something similar.  A pipeline like this:

Language analysis -> Dialogue manager (with reference to a knowledge base) -> Language generation

also needs some sort of pragmatics behind it.  E.g. in a healthcare scenario there is an obligation to help. 
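A minimal sketch of such a pipeline, with verbosity and directness as explicit style parameters (the knowledge base, the phrasing, and the names here are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class Style:
    verbose: bool = False
    direct: bool = True

# Toy knowledge base standing in for the dialogue manager's backend.
KB = {"headache": {"advice": "take an aspirin",
                   "extra": "drinking water and resting also tends to help"}}

def dialogue_manager(user_utterance: str) -> dict:
    """Very crude 'language analysis': look for a known complaint."""
    for topic, facts in KB.items():
        if topic in user_utterance.lower():
            return {"topic": topic, **facts}
    return {}

def generate(state: dict, style: Style) -> str:
    """Language generation step, with style-controlled directness and verbosity."""
    if not state:
        return "Could you tell me more about how you are feeling?"
    core = (f"You could {state['advice']}." if style.direct
            else f"Many people {state['advice']} in this situation.")
    return core + (f" Also, {state['extra']}." if style.verbose else "")

print(generate(dialogue_manager("I have a headache"),
               Style(verbose=True, direct=False)))
```

The pragmatics question is then about what sits above this: nothing in the sketch says when the system is obliged to volunteer the extra information.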

Eugenio Concepción, Gonzalo Mendez and Pablo Gervás
Mining Knowledge in Storytelling Systems for Narrative Generation

The talk at least contains a nice survey of computer fiction systems and the historical dimensions that storytelling systems consider.  The authors are trying to come up with a common structure for representing all of this stuff.  (Which reminds me of D. Winterstein's effort to build a common structure for representing different kinds of poetry, or the interesting regular expression based structuring in the new clojure.spec library...)

Writing itself is framed as a writing-and-revision cycle that could explore the different facets of structure.  I'm reminded of Hjelmslev's theory (he talks about structure and process -- and also seems to offer a reminder that just representing structure isn't going to fly).

Stephen McGregor, Matthew Purver and Geraint Wiggins
Process Based Evaluation of Computer Generated Poetry

How do people think about computer poetry?  We thought that people would be very sensitive to procedural descriptions, but at least at this point in history, they aren't.  "We should be suspicious of mystically inspired computers."

Creative: reaction to the context.  Context also shows up by analysing e.g. "flowers" and "romance" and taking the dimensions that are high for both.  (Does this just correspond to picking the words that co-occur with both?)  You can impose a topology/topography (how?).
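My guess at what "dimensions that are high for both" amounts to, as a toy computation (the vectors and dimension labels are hand-made stand-ins for a learned distributional space):

```python
import numpy as np

# Toy word vectors over hand-labelled 'dimensions'; in the talk these
# came from a learned distributional space.
DIMS = ["garden", "courtship", "cooking", "geometry"]
VEC = {
    "flowers": np.array([0.9, 0.6, 0.1, 0.0]),
    "romance": np.array([0.2, 0.9, 0.1, 0.0]),
}

def shared_context(w1, w2, k=2):
    """Dimensions on which both words score highly (elementwise minimum),
    a crude stand-in for 'the context the two words share'."""
    joint = np.minimum(VEC[w1], VEC[w2])
    top = np.argsort(joint)[::-1][:k]
    return [(DIMS[i], float(joint[i])) for i in top]

print(shared_context("flowers", "romance"))
# [('courtship', 0.6), ('garden', 0.2)]
```

Which does look a lot like picking the contexts that co-occur with both words, hence the question in parentheses above.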

A first-person narration by the computer as "I" obscures the computational process.  How about presenting the computer in a more straightforward way?

... But, I think the discussion of how the poem was made was just not that interesting, and similarly, the poem itself was not very interesting.  

Q. Why is it SO bad? 

A. Perhaps because we're presenting such an odd collection of words.  However, mixing up different themes could be a good thing, if the computer wants to combine the themes.  E.g. "Love is like a breadbox" could be interesting if there was a good explanation for it.

Q. But how DO people assess poetry? ...


Andrea Valle is a semiotician.  This talk seems pretty exciting.  Task: you have to understand the data structures and algorithms inside a system that was written by an artist.  The artistic community of the time knew Balestrini's work.

"How did the available technology constrain and shape the work of the poet?"

"What is the relation between the poet and the machine in the creative process?"
the divine fury of the poet ... is converted into the infinite technical possibilities of the electronic instrument, elected both as the imaginative stimulus and as the practical manufacturer. - Sanguineti, 1965

poetry is an operation, the poet shows, precisely, how to act - Brancaleoni, 2007

Saturday, September 3, 2016

genetic method

“I take culture to be those webs, and the analysis of it to be therefore not an experimental science in search of law but an interpretative one in search of meaning. It is explication I am after, construing social expressions on their surface enigmatical.” -Geertz
(Hm... perhaps Geertz is more exegetical than the social expressions themselves are enigmatical?)

In the meantime I've been looking at a couple of papers on Nietzsche and the genetic method.

D'Iorio, Paolo. "The Eternal Return: Genesis and Interpretation." Lexicon Philosophicum: International Journal for the History of Texts and Ideas 2 (2014).

I find the argument here really convincing.  D'Iorio does some impressive sleuth work, heading from library to library to look at Nietzsche's notebooks and his hand-annotated texts:


By contrast I'm somewhat less impressed with:

de Man, Paul. "Genesis and Genealogy in Nietzsche's The Birth of Tragedy." Diacritics 2, no. 4 (Winter 1972): 44-53.


De Man makes some really useful points that help to understand the "genetic pattern" employed in The Birth of Tragedy, but he seems to use Nietzsche's dramatic re-enactment of his themes -- e.g. leaving some things out of the final treatment and using a sort of grand rhetoric -- as a way to 'assassinate' the genetic model broadly speaking:


It's certainly interesting to think of the genetic model as a rhetorical mystification, but I'm not at all convinced that Nietzsche's use of it as such (if we're able to believe that) would invalidate, say, D'Iorio's use.  It does cast some doubt on de Man's logic, however.

A genetic metaphor (e.g. Dionysus 'giving birth' to Apollo in Nietzsche's framework) is one thing -- an interpretation (e.g. the "aberrant interpretation of Romanticism" that de Man criticizes) is another.



from: Self, Text, and Romantic Irony: The Example of Byron by Frederick Garber, p. 228

To be honest it seems to me that de Man is inventing a quite specific straw-man concept of "a genetic pattern based on the language of organic totalities" that Nietzsche, for his part, would gladly join him in rejecting -- as we can readily see by deciphering his handwriting and its context, indicated above.  Back to D'Iorio --


Following a clue in Garber's book, this points to a relevant quote from Jakobson:


from: A Certain Difficulty of Being: Essays on the Quebec Novel by Anthony Purdy, p. 32

I think we can agree that the relevant relationships are not always contiguous ones.  This matter of "the realistic author [who] metonymically digresses from the plot to the atmosphere and from the characters to the setting in space and time" would seem to match Geertz's thick description reasonably well.

THAT SAID: 'description' as a method has been amply and ably criticized by Tim Ingold, who suggests 'correspondence' instead. To me this seems very similar to the jump from Nietzsche to Simondon, i.e., from genesis to ontogenesis.

Simondon's philosophy feels very "slushy" -- it doesn't seem to believe in organic totalities at all, but it does believe in the existence of the world, and therefore the need to understand beings in context (including the context of their becoming).