NOTES ON CONCEPTS OF LANGUAGES & GRAMMARS, MACHINES & AUTOMATA, COMPUTATION, THEIR QUANTUM ANALOGS AND RELATED BEASTIES The definitions here go pretty much in a logical order in that any definition depends on what has comes before; the concepts are also cross referenced by hyperlinks. Example and further elaboration is relegated to URLS and references, so this should be mostly understood as a simple, and possibly even a useful crib sheet of definitions and relationships.

**An Alphabet****A Language L**(spoken or written)**A Syntax or Grammar****A Generative Grammar****A Determinative Grammar****A Complete Generative Grammar****An Incomplete Generative Grammar****Semantics****A Formal Language****Operations On and Between formal languages****Formal Grammar****Lindenmayer Systems (L-Systems) Cf. also Wiki****Chomsky Hierarchy of Formal Grammars****Backus Normal Form, or Backus-Naur Form (BNF)****Extended Backus-Naur Form (EBNF):****A Complete Sequential Machine****Equivalence Classes of States****Transition Preserving Morphisms****Reduced Machine****Submachine****Irreducible Machines****Finite State Machine (FSA)****Automaton****A Nondeterministic Finite Automaton (NFA)****Deterministic Finite Automaton (DFA)****Pushdown Automaton****Turing Machine****Universal Turing Machine****Lambda Calculus of Recursive Functions****Linear Automaton****Stochastic Process****Markov Chain (denumerable)****Markov Process****Complex Markov Process****Stochastic Automaton, Markov Machine****Fuzzy Logic****Fuzzy Automaton****Lattice Automaton****Weighted Automaton****Homomorphisms of Automata****Automorphisms of Automata****Self Reproducing Automaton****Quantum Language****Quantum Grammar****Quantum Logic****Quantum Stochastic Process****Quantum Automaton****Finite Quantum Automaton****Finite Quantum Markov Process****Chu Spaces****Entropy****Information****Entropy-Information Ansatz****Computer****Quantum Computation****The Language of Quantum Field Theory**

An alphabetis a finite set of discriminatable and irreducible symbols that can be used either as such, or as representations of some other such set. This abstracts and refines the linguistical idea of alphabet where the symbols map to sounds in a most highly contextual way, most especially in English, which because of its multilingual basis is rather unphonetic in a strict sense. In spoken lnguage, the alphabetic elements are calledphonemes, while in written language they are calledlexemes. Both have a sense of irreducibility. Therefore, in English, consider "th" voiced and unvoiced as distinct lexemes. In Polish, consider "cz" and "sz" as alphabet elements, similarly in Hungarian, "s" and "sz" are distinct. This sense of alphabet would also include what a linguist would distinguish as a "syllabary" [Wikipedia], of symbols of the forms, C, V, CV, VC, and even possily VCV or CVC. The symbols would then most generally be called graphemes.A Language L(spoken or written) formally given, is a set S(A*) of finite length strings A* formed from some primary finite set A usually called the alphabet possessed of a syntactical (grammatical) and a semantical structure. S(A*) is a subset of the set A* of all possible strings formed from an alphabet A.A Syntax or Grammarfor a language L is a set of rules through which the strings of L may either be generated, or through which any element of A* can be determined to be an element of S(A*). These two types of grammars can be quite different, and inequivalent, so much so that where a grammar of the second type may exist, one of the first may not.A Generative Grammaris of the first type in the definition of grammar for language L.A Determinative Grammaris of the second type in the definition of grammar for language L.A Complete Generative Grammaris one that generates all the strings of S(A*) for language L.An Incomplete Generative Grammaris one that generates some of the strings of S(A*) for language L.Semanticsis the structure of a language L that gives it meaning relating it to its model referants; it is an interpretation of the language in terms of some model M involving interpretive maps from L to M as well as metaphorical maps within L itself. Though M is the universe of discourse for L, a unique M for any given L is unlikely. Among human languages, written Chinese and a period form of Babylonian are languages that require no sound value interpretation to be understood, similarly, mathematics and computer languages can be almost completely understood through their symbology alone.A Formal Languageis simply the subset S(A*) part (string part) of L with it's most general meaning. A formal language is a quadruple (N, T, P, S), with N - a finite set of nonterminal symbols. T - a finite set of terminal symbols that is disjoint from N. P - a finite set of production rules where a rule is of the form X → Y X,Y in (T union N)* with '*' the Kleene star [Wikipedia] and with the restriction that the left-hand side of a rule contains at least one nonterminal symbol. S - a symbol in N that is called the start symbol. For terminal symbols think "atomic" in the sense that they cannot be reduced further. If we were talking LISP, the terminals would include those symbols which evaluate (eval) to themselves: 't', 'nil', and any real or complex number; and the "list delimiter symbols" '(' and ')'. Sometimes, languages defined by strings are called "algebras of strings", but they do not satisfy the usual mathematical requirements of algebras: the only binary operation is the generally noncommutative "concatenation", the multiplication of the algebra. A formal language can, however, be extended to a genuine algebra over the Galois field G[3] by introducing an operation of "reversal", (-1)(abc) = (cba) and allowing sums of strings as elements of the algebra with coefficients in G[3], with S + (-1)S = (), the null string. This is a noncommutative associative algebra over G[3], without identity. You can also get more sophisticated with the field of the algebra, where it may start to look like an algebra that is a "quantum language". Some languages are group representations, and their algebras by such extensions become group algebras. The algebra of transformations of strings is the basis of representations of Markov processes, and a deep formal connection of these with quantum field theory expressed in perturbative form. If there are algebras of strings (one of which happens to be the metamathematical expression of formal mathematical logic), are there also algebras of higher dimensional objects transcending linear expressions of language? There is an algebra of Feynman graphs - and one might also say a pseudoalgebra of Chinese ideograms, a language in itself, divorced from spoken language as was also a certain level of ancient Babylonian cuneiform.Operations On and Between formal languagesThe only fundamental characteristic of a language generated by an alphabet A is that it is a subset of A*, therefore, Let L1 and L2 be languages over a common alphabet A, then: The set theoretic union of L1 and L2 is a language. The set theoretic intersection of L1 and L2 is a language. The set theoretic difference L1 - L2 is a language. The set theoretic symmetric difference (L1 - L2 union L2 - L1) is a language. If L2 is a subset of L1, define L2 as aSublanguageof L1. For every language L(A), there isComplement LanguageL^{c}(A) with respect to A* defined by L^{c}(A) := A* - L(A) The string concatenation of all strings xy with x in L1 and y in L2 is a language. This is a direct sum of languages, which happens to be equivalent in this case to a direct product of languages, as well as a tensor or Cartesian product of languages (L1, L2), or written simply as L1 L2. As an inverse to concatenation, a right quotient language L1/L2 exists whose elements are those strings x for which there is a y in L2 with xy in L1. A left quotient language can also be defined similarly. If L is a language, then L* is also a language; if L is a language of sentences, then L* is the language of all its literary works. If L is a language, the set of all its reversed strings, L^{r}is also a language. These do not exhaust at all the possibilities, they are simply some of the more interesting, fundamental and useful operations. The category of formal languages with common alphabet A is an involutive, commutative and associative ring with identity. Now consider languages with different alphabet, A and B. If the cardinalities |A| and |B| of A and B respectively are equal then A and B are isomorphic and so A* and B* are isomorphic. For unequal |A| and |B|, there are three nontrivial possibilities: 1. A is a subset of B, or B is a subset of A. 2. A and B have a common nonnull subalphabet. 3. A and B are disjoint. Suppose that B is a subalphabet (subset) of A. Then, for any L(A) there exists a language L(B) by process of deleting the elements of (A-B) from the strings of L(A). Then, call L(A) a Superlanguage of L(B). While the sublanguage so constructed is unique, L(B) can have many superlanguages. A superlanguage is a consistent extension of a language. If A and B have a common subalphabet, L(A) and L(B) have a common sublanguage L(A intersection B). If A and B are disjoint, L(A) and L(B) each contain a sublanguage, which are isomorphic to one another. --------------------------- The category of all formal languages is an involutive, commutative and associative ring with identity containing a nested family of subrings indexed by alphabets.Formal Grammaris a way to describe a formal language. In computer science, it is specifically a generative grammar, usually expressed in top down Backus Normal Form. The syntactically correct strings of the language are generated by free application of grammatical rules of substitution. A grammar for a language is not unique whether the language be formal or not. Assume a start symbol S analogous to an empty set, then the substitution or production rules generate new strings from known strings. Cf. Formal Grammar Wikipedia entry for formal definition and examples. ------------Lindenmayer Systems (L-Systems) Cf. also WikiThe production rules encapsulate the underlying processes that result in potential states. In their sequences of applications to produce and given string, the sequence is not necessarily unique; different sequences or paths though the string S(A*) may yield the same string. When different paths are possible, the grammar is called ambiguous.Chomsky Hierarchy of Formal GrammarsThe definition of formal languages is so general that the collection as a whole fails to exhibit what one might call "interesting structure". The very same thing happens in topology, where Hausdorff supplies a set of additional axioms of increasing stricture (and structure) to create topological spaces with more specific interest. Chomsky does likewise with formal languages:Type 0: Unrestricted Grammars are all formal grammars. They are exactly all those that can be accepted by some Turing machine.Type 1: Context Sensitive Grammars. These have rules of the form aAb → agb where to make the substitution, the preexistence of prefix 'a' and postfix 'b' is required; that is the context to which the rule is sensitive.Type 2: Context Free Grammars. These have production rules all of which do not have any of the contextual restrictions of type 1 grammars.Type 3: Regular Grammars are formal grammar (N, T, P, S) where all the production rules in P are of one of the forms: A → a A → aB or A → Ba A → e Regular languages are recognizable by finite state automata.Backus Normal Form, or Backus-Naur Form (BNF)is a collection of metasyntactical axioms for "top down" recursive schemes defining a formal grammar. As prototypical example of BNF, BNF is defined in BNF: NB: the notation is not the historically prescribed notation. grammar : grammar rule | rule rule : rule '|' formulation | NONTERMINAL ':' formulation | NONTERMINAL ':' formulation : formulation symbol | symbol symbol : NONTERMINAL | TERMINALExtended Backus-Naur Form (EBNF):with [] = optional, {} = multiplicity including 0, () = precedence grammar : rule { rule } rule : NONTERMINAL ':' [ formulation ] { '|' formulation } formulation : symbol { symbol } symbol : NONTERMINAL | TERMINAL | '{' formulation '}' | '[' formulation ']' | '(' formulation ')'A Complete Sequential Machineis a sextuplet (Q, q0 I, O, i, o), where Q is the set of machine states q0 is an initial state, a member of Q I is the set of input symbols (input alphabet) O is the set of output symbols (output alphabet) i is the input function mapping Q x I *into* Q or (transition function) o is the output function mapping Q x I *into* O i: (Q x I) -→ Q o: (Q x I) -→ O i * o : ((Q x I) x I) -→ O The completeness is that the domain of i is the complete Cartesian product Q x I. For an incomplete machine, there may be i(q, a), for s in Q and a in I that are not defined.Equivalence Classes of StatesTransition Preserving MorphismsReduced MachineSubmachineIrreducible Machinesare those which have no submachines.Finite State Machine (FSA)is a sequential machine with a Q of finite cardinality. Acceptors - answer yes or no to their input. Recognizers - classify their input. Transducers - translate their input, as one language to another.AutomatonAn automaton is the heart of a sequential machine, that which deals only with inputs and states. Unless otherwise noted, the automata considered here will be finite. They can profitably be classified as either nondeterministic or deterministic. NB: It is important to notice that a nondeterministic machine can exist in several states simultaneously while a deterministic machine exists only in one state at a time. The transitions for a nondeterministic machine are not a matter of competing alternatives, as they are for a quantum automaton. It is in this sense that a nondeterministic automaton can be represented by a simple statisticalized collection of deterministic automata.A Nondeterministic Finite Automaton (NFA)There can be several *simultaneous* transitions for each possible input. These can be implemented with larger deterministic automata, at worst of exponentially larger size. It is defined as a sextuplet (Q, q0 I, O, d, f), where Q is the set of machine states q0 is an initial state, a member of Q I is the set of input symbols (input alphabet) V is the valuation set {0, 1} d is the transition function governming whether there is an exisiting transition between q and q', elements of Q, and so is a mapping d: Q x I x Q → V f is the final state determination function f: Q → V Again, if Q is finite, the automaton is finite.Deterministic Finite Automaton (DFA)As usually defined these are a special kind of NFA. There is only one possible transition for each possible input. It is defined as a sextuplet as for a nondeterministic automaton with the condition that for any q in Q and a in I, there is a unique q' such that d(q, a, q') = 1Pushdown Automatonis an automaton equipped with a potentially infinite amount of memory in a (LIFO) stack. Every pushdown automaton accepts a formal language. The languages accepted by nondeterministic pushdown automata are precisely the formal languages with context free grammars. Compilers for computer languages are approximations to an ideal finite state pushdown automaton, a foible being that they can "blow their stacks".Turing MachineUniversal Turing MachineLambda Calculus of Recursive FunctionsLinear AutomatonStochastic Processis a random process that takes place over time which can be considered as given by X(t), a family of random variables parameterized by t. While t is often taken as a continuous real parameter, for purposes here it will be taken as uniformly discrete, or it may take on values in some discrete set. The idea ofstochasticity, breezily stated, is that the random variables X(t) and X(t') are not correlated. Prototypically, think of infinite sequences of coin tosses where the random variables all have values in the set {H, T} and are governed by a binomial distribution; the result of any one toss is not dependent on, or correlated to any other toss.Markov Chain (denumerable)A sequence of random variables (Xn : n = 0, 1, 2, ...} is called a Markov chain if for every finite collection of integers n1 < n2 < n3 < ... < nr < n, the conditional probability distributions, Pr{ Xn | Xn1, Xn2, Xn3, ... Xnr } = Pr{ Xn | Xnr } The chain is called aDenumerable Markov Chainor aFinite Markov Chainaccording on the cardinality of the set of possible values of the random variables Xn.Markov ProcessA Markov process is a stochastic process exhibited by a Markov chain that lends itself to the description of a stochastically sequential machine or an automaton because of its sequential assumption: the state of the system s(k) depends only on its predecessor s(k-1). When this condition is moved via some limiting process from a discrete environment to a continuous environment, it becomes the condition that the "law of motion". i.e., the law which predicts future states from a current state or a past state is a differential equation. Not all physical processes are of this nature. E.g., in rheology, the theory of deformations of continuous, plastically deformable substances, the fundamental equation is an integrodifferential equation because the current state depends on the entire history of the deformations which is summed up in an integral with a variable limit of integration. This turns up in a realistic treatment of a spring which takes into account the limit of stretching noticed by Young beyond which stretching becomes irreversible. In a Markov process for a sequential machine, the machine output becomes in essence the current state which as the next input results in the successor state. All that is required to make the machine run is an initial input.Complex Markov ProcessStochastic Automatonhas the same definitions as for a nondeterministic automaton, NFA except that for d(q, a, q'), we say instead that any given q and a, that Sum d(q, a, q') = 1 q' the sum being taken for all q in Q. Now, however, a determination of a future state becomes more problematic since the history of the transitions comes into play, and the waiting time them matters. Consider an extension of d(), d'() defined for a sequence of n inputs (a) := (a1, a2, ... an), with (q) = (q1, q2, ..., qn) a vector of "dummy variables of summation", each qk ranging over Q, d'(q, (a), q') := Sum d(q, a1, q1) d(q1, a2, q2) d(q2, a3, q3) ... d(qn, an, q') (q) Defining a matrix D(a1) with elements d(q, a1, q1), q indexing rows and q1 indexing columns, the above sum of products is seen to be a matrix product, so that for the similarly define d'(), a function of the (time ordered) sequence of inputs (a), D'((a)) := D(a1) D(a2) ... D(an) The elements D'(q, (a), q') of D'((a)) now give the probabilities after the input sequence that the state is q' if the automaton began in state q. Notice that the essential condition on d() says that the sums of the rows of the matricies D equals 1, which is the condition for D to be astochastic matrix. One might also call a stochastic automaton aMarkov Machinesince it models a Markov process. As the transition matrix D(a) of a stochastic process can beDoubly Stochastic, i.e., in addition, the columns sum to one, there are alsoDoubly Stochastic Automatawith the additional condition that Sum d(q, a, q') = 1 qFuzzy LogicThere is an intimate relationship between formal set theory and formal logic, an isomorphism in fact. While in formal set theory, the membership relation x is a member of X is a logical proposition with truth values in the diploid set {0, 1}, in fuzzy set theory, the truth values lie in the real interval [0, 1] which is then associated with a probability measure. From this assumption on "truth valuation" all Fuzzy logic flows. E.g., if A and B are two disjoint sets, x in A with prob. p1 AND x in B with prob. p2 implies that x in A union B with prob. p1 + p2. Since formal logic/formal set theory is the foundation of all mathematics, this "fuzzification" is extendible to the entire to the entire body of mathematics. "Fuzzification" is simply a "statisticalization"; quantization is more sophisticated. The word "quantization" is a fuzzy thing in itself that depends for its proper understanding, a context that can vary quite widely in the realms of both mathematics and physics.Fuzzy Automatonis similar to a stochastic automaton. Where stochastic automata generalize the DFA, fuzzy automata generalize the NFA. Formally, a fuzzy automaton is a sextuple (Q, q0 I, O, d, f), where Q is the set of machine states q0 is an initial state, a member of Q I is the set of input symbols (input alphabet) V is the valuation set {0, 1} d is the transition function governing whether there is an exisiting transition between q and q', elements of Q, and so is a mapping d: Q x I x Q → V f is the final state determination function f: Q → V Again, if Q is finite, the automaton is finite. ------------------ Sup and Inf stuff --------------------------Lattice AutomatonWeighted AutomatonHomomorphisms of AutomataAutomorphisms of AutomataSelf Reproducing AutomatonQuantum AutomatonQuantum LanguageQuantum GrammarQuantum Logicorthomodular latticeQuantum Stochastic ProcessQuantum AutomataFinite Quantum AutomataFinite Quantum Markov ProcessChu SpacesChu GuideEntropyFirst arises in the context of classical thermodynamics and the Carnot engine (an ideal heat pump) where it is defined as an integral over a cycle of the Carnot engine, S := I dQ/T where dQ is an infinitesimal amount of heat transfered at temperature T. Its fundamental relation, derived from the first law of thermodynamics (an expression of the conservation of energy) is, ++++++++++++++++++++++++++++++++ The original concept of entropy was a heat that was unavailable to do work, an idea that was apparently first understood by Carnot, who also gave the second law of thermodynamics as a "principle" of the subject. Reinterpreting the idea and the thermodynamic variable in terms of the statistics of atomic theory, and equating the old variable with the statistically derived quantity by Ludwig Boltzmann was quite a transcendental leap that both clarified the concept and complicated the second law, and its legitimate applications. In preatomic thermodynamics, it was understood that the second law applied to closed systems. This remains from the statistical mechanical point of view, but the added understandings are that it applies also only over long times; furthermore, entropy now allows statistical fluctuations that are unknown in classical thermodynamics. With the advent of statistical mechanics based on atomic theory, the statistics of the classical mechanics of atoms is to "explain" thermodynamics, and in this case, the functional of a probability distribution {p_{k}} usually taken to express the entropy is S({p_{k}}) := - Sum p_{k}ln( p_{k}) k This is not the only expression for entropy, just the first.InformationFirst defined by Shannon, independently of Boltzmann's work, S({p_{k}}) := + Sum p_{k}log( p_{k}) k This is not the only expression for information, just the first. The logrithm base often used in 2. It is defined as a functional of a proability distribution, with obeisance to the abstract fundamental bit encoding of abstract computers. The mathematical sense of "information" is not about *information* in the ill defined common sense. The Shannon type functional is an intrinsic definition that transcends language and semantics, and has to do with "information to a possible decoding machine - and I do mean machine like a Turing machine" reading explicitly from a string of symbols. Our garden variety concept of information is more complicated since it depends on semantics and extended metaphors. The mathematical definition avoids all that and concentrates on the statistics of the symbolic stream: an 'e' is the most prevalent letter in English text (and also probably French text, I never saw the numbers for that and don't remember if I did), therefore, it carries the least information: distinguish all English words that have 'e' in them. A rare letter like 'z' carries more determining weight (and information): distinguish all English words that have 'z' in them. Letter 'z', however is more plentiful in Hungarian. No, I haven't done or looked up the statistics, I go on what I learned of Hungarian, and partially in that "az" is the Hungarian "the". The flatter the probability distribution, the less info any alphabetic symbol has, the less information that symbol can be said to carry, and the more entropy (disorder) associated with the probability distribution.Entropy-Information AnsatzStructure and order of energy is a measure of information, while entropy is a measure of energetic disorder. Historically, the thermodynamic concept of entropy came first, but almost since Shannon's first papers, the idea that the two concepts are intimately related arose. Strangely, the semantics of their relationship as negatives of each other has only recently been proposed, particularly in the understanding of the entropy of black holes in general relativity, but before that in analyzing why the Maxwell Demon of statistical mechanics ultimately fails. Increasing the entropy of any closed physical system decreases its information content by an equivalent amount. In a sense, looking at entropy and information in the context of physical systems can be seen as different interpretations of the same thing. Remember though that Shannon's original idea has to do with strings of alphabetic symbols sent and received over a "channel" that as a process had probabilities attached to the symbols. Out of this comes coding theory, and selfcorrecting codes suitable for transmission through noisey channels.ComputerNeumann Models Turing Models Alonzo Church ModelsQuantum Computation(from Feynman's suggestion in 1982, that the real world actually operates as a vast computer) A situation where it is possible that the physical theory which may be useful to understand how things happen, may refuse to extend itself to useful engineering applications for the purposes of control. In the history of science and engineering, this may be the first instance of where government cannot use scientific developments for their all too usual purposes of control and genocide. This does not mean that governments will not create nonscience to to excuse their own rather insane and irrational lusts; acting on their own lusts is, after all, what governments do, and so long as they exist, there is no stopping them from their attempted mendacious and genocidal intents. Realities never disuade them from their primitive obsessions.The Language of Quantum Field TheoryTOC

Footnotes 1. 2. 3.

Top of Page (TOC)Email me, Bill Hammel at

bhammel@graham.main.nc.us

READ WARNING BEFORE SENDING E-MAIL

COPYRIGHT NOTICE REGARDING ALL ORIGINAL WORK HEREIN: © August 2003 by Bill Hammel (bhammel@graham.main.nc.us). Permission to use for any noncommercial, educational purpose. This copyright and permission notice must appear in all copies. Permission is also granted to refer to or describe these documents in commercial books, products, or online services. These documents may be freely reproduced, copied and disseminated by any electronic, digital or written means, but in no case may such copying or dissemination be charged for. The idea is very simple, no person or body has supported any of the original works contained in this pages. They are works of love given freely. I find repugnant the idea of someone expropriating, for profit, what I give freely. If you have a problem with this, ask; rules always have exceptions.

The URL for this document is:

http://graham.main.nc.us/~bhammel/PHYS/autom.html

Created: June 3, 2004

Last Updated: October 24, 2005

Last Updated: January 25, 2006