Formal English

Formal English is intended as a universal language for storage and exchange of data between computers and databases. Its application has a number of benefits:

  • It enables software systems to interoperate without converting data and without defining dedicated message structures and interface conventions.
  • Queries and messages between systems can be produced and interpreted by general purpose software.
  • Software that operates on Formal English (or on other language variants of the Gellish family) can be reused without modification, and it can apply reasoning because logical relations are built into the language.
  • The language is system-independent, extensible, and adaptable to specific business requirements and to organization-specific vocabularies.
  • The language is a formally defined, structured, and unambiguous subset and standardization of natural English. Its semantics are defined in the electronic English taxonomic dictionary-ontology, which includes computer-interpretable knowledge such as a consistent subtype-supertype hierarchy. It is therefore a formal language that computers can interpret unambiguously, while it remains readable and interpretable by humans.
  • It is multilingual, because Formal English is a member of the Gellish family of formalized natural languages, which share language-independent unique identifiers for concepts and expressions. This implies that expressions in Formal English can be translated by computer into every other language of the family for which a formal dictionary is available, such as Formal Dutch (‘Formeel Nederlands’). The language can even be applied in combination with non-Gellish dictionaries.
  • The application of the language is guided by a consistent, rigorous methodology that is documented in the book ‘Semantic Information Modeling Methodology’. It is supported by a wiki and by software that verifies the quality of expressions in Formal English and in any other language variant of the Gellish family.
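
The roles of the subtype-supertype hierarchy and of the language-independent identifiers mentioned above can be illustrated with a minimal Python sketch. It is not the Gellish software or its dictionary: the identifiers (1001 to 1003), the names, and the helper functions are invented for this illustration, and the sketch assumes that each concept has a single direct supertype, which a real taxonomy need not guarantee.

```python
# Hypothetical concept identifiers (UIDs) and names, invented for illustration;
# they are NOT taken from the actual Gellish dictionary.
SUPERTYPE_OF = {
    1003: 1002,  # centrifugal pump -> pump
    1002: 1001,  # pump -> physical object
}

# One name table per language, keyed by the same language-independent UIDs.
NAMES = {
    "en": {1001: "physical object", 1002: "pump", 1003: "centrifugal pump"},
    "nl": {1001: "fysiek object", 1002: "pomp", 1003: "centrifugaalpomp"},
}


def supertypes(uid):
    """Return all direct and indirect supertypes of a concept, nearest first.

    Assumes a single direct supertype per concept (a simplification).
    """
    chain = []
    while uid in SUPERTYPE_OF:
        uid = SUPERTYPE_OF[uid]
        chain.append(uid)
    return chain


def is_kind_of(uid, candidate):
    """True if `uid` equals `candidate` or is a (transitive) subtype of it."""
    return uid == candidate or candidate in supertypes(uid)


def name_in(uid, language):
    """Look up the name of a concept in the given language."""
    return NAMES[language][uid]
```

Because both name tables are keyed by the same identifiers, rendering a concept in another language is a plain lookup, and a query for "pump" can also find centrifugal pumps by walking the hierarchy.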

Formal English consists of the following components:

  1. A Formal English grammar.
    The specification of the grammar describes how ideas (facts, statements, questions and answers) can be expressed in Formal English.
  2. An Electronic English Taxonomic Dictionary and Ontology.
    The dictionary defines concepts and kinds of relations for nearly every application area. It contains many specialized concepts, such as product types, characteristics and standardized property values, which do not appear in conventional dictionaries. Users can extend the dictionary and add private synonyms and company-specific terminology. The dictionary is subdivided into partial taxonomic dictionaries for various application areas, for example domain dictionaries for various technical and commercial disciplines as well as for geography, biology, etc.
  3. A collection of kinds of contextual facts.
    Contextual facts describe the context of expressions and are required for their unambiguous interpretation. Examples are the intention of an expression, its validity context, and who expressed it and when.
  4. A standard format for expressions, called the Gellish Expression Format.
    The Gellish Expression Format is a language-independent specification of a universal expression format. It defines the syntax of expressions and can be used for storage and exchange of data. It can be implemented in various data interchange formats, such as CSV and JSON, or even in spreadsheet formats.
  5. A reference application.
    The reference application is a collection of open source software intended for verifying the consistency and correctness of the language definition. Its further objectives are to demonstrate the capabilities of the Gellish family of formalized languages (e.g. as a knowledge base, a data and document management system, or a dictionary and model server) and to serve as a source of reusable Gellish-enabled software.
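
To make components 3 and 4 more concrete, the following Python sketch shows how one expression, together with a few contextual facts, might be stored as a record and written to both JSON and CSV. The field names, the identifiers, and the sample values are assumptions made for this illustration; they are not the column names or identifiers defined by the actual Gellish Expression Format.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class Expression:
    """One expression: left object, kind of relation, right object.

    All field names are hypothetical, not the official format's columns.
    """
    left_uid: int
    left_name: str
    relation_uid: int
    relation_name: str
    right_uid: int
    right_name: str
    # Contextual facts needed to interpret the expression unambiguously.
    intention: str   # e.g. statement, question, answer
    language: str
    author: str
    date: str


def to_json(expressions):
    """Serialize a list of expressions to a JSON array of objects."""
    return json.dumps([asdict(e) for e in expressions], ensure_ascii=False, indent=2)


def to_csv(expressions):
    """Serialize a list of expressions to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(Expression)],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(asdict(e) for e in expressions)
    return buffer.getvalue()


# Sample record with invented identifiers and values.
EXAMPLE = Expression(
    left_uid=1003, left_name="centrifugal pump",
    relation_uid=2001, relation_name="is a kind of",
    right_uid=1002, right_name="pump",
    intention="statement", language="English",
    author="example author", date="2020-01-01",
)
```

Calling `to_json([EXAMPLE])` or `to_csv([EXAMPLE])` yields the same record in two interchange formats, which mirrors the idea that the expression format is independent of the file format that carries it.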

Further information is available on the Gellish website.