JSON User Guide

Introduction

This document introduces users of Graph for Scala to exporting Graph instances to JSON text and to populating graphs from JSON text. Thus, it may be viewed as a supplement to the User Guide.

JSON texts may embed node/edge sections at any point. These sections must adhere to the Graph for Scala JSON Grammar to enable data retrieval. The Graph for Scala JSON Grammar, an extended JSON grammar, has been designed to be flexible in the following ways:

An arbitrary number of node/edge sections within the same JSON text will be processed to support different node and edge types within the same Graph instance.
JSON texts to be imported may include any non-graph-related data, which will be discarded.
All identifiers within the JSON text marking node/edge sections or node/edge types are configurable.
The user has full control over JSON formats representing nodes/edges.
The user also has fine-grained control over each phase of the import/export process.

With the exception of serializers, Graph for Scala JSON import/export is transparently implemented on top of Lift-Json.

Graph for Scala JSON is supplied as an extra module (jar). graph-json_XXX.jar depends on graph-core_XXX, lift-json_YYY and paranamer-ZZZ, all of which must be available at run-time. For the latest release numbers see Version.scala.

Most examples in the following chapters are based on a partial ^[1] academic library application backed by a graph. In this library graph, books and authors are represented by nodes, authorship by edges:

// node types: Book, Author
sealed trait Library
case class Book  (val title: String, 
                  val isbn:  String) extends Library
case class Author(val surName:   String,
                  val firstName: String) extends Library

// node data: 2 books, 4 authors
val (programming, inDepth) = (
  Book("Programming in Scala", "978-0-9815316-2-5"),
  Book("Scala in Depth",       "978-1-9351827-0-2")
)
val (martin, lex, bill, josh) = (
  Author("Odersky", "Martin"),
  Author("Spoon", "Lex"),
  Author("Venners", "Bill"),
  Author("Suereth", "Joshua D.")
)

// graph with 2 authorships
val library = Graph[Library,HyperEdge](
  programming ~> martin ~> lex ~> bill,
  inDepth ~> josh
)

The example code is incorporated in TJsonDemo.scala.

Exporting graphs

To export a graph instance to JSON text you call toJson:

import scalax.collection.io.json._
val exported = library.toJson(descriptor)

Alternatively, you can control export phases one by one:

import scalax.collection.io.json.exp.Export
val export = new Export[N,E](library, descriptor)
import export._
val (nodesToExport, edgesToExport) = (jsonASTNodes, jsonASTEdges)
val astToExport = jsonAST(nodesToExport ++ edgesToExport)
val exported = jsonText(astToExport)

Clearly, exported of type String will contain the JSON text, but what about the descriptor argument?

Working with descriptors

Fine-grained control over JSON import/export is achieved by means of Graph JSON descriptors, a kind of export/import configuration made up of

node descriptors for each node type (see arguments defaultNodeDescriptor and namedNodeDescriptors)
edge descriptors for each edge type (see arguments defaultEdgeDescriptor and namedEdgeDescriptors) and
node/edge section identifiers (see argument sectionIds)

Prior to calling toJson you need to think about which node/edge types your graph contains and how you want to serialize them in terms of Lift-Json serialization. For our academic library example you may start with

val bookDescriptor = new NodeDescriptor[Book](typeId = "Books") {
  def id(node: Any) = node match {
    case Book(_, isbn) => isbn
  }
}
val authorDescriptor = new NodeDescriptor[Author](typeId = "Authors") {
  def id(node: Any) = node match {
    case Author(surName, firstName) => "" + surName(0) + firstName(0)
  }
}
import scalax.collection.io.json.descriptor.predefined.DiHyper
val quickJson = new Descriptor[Library](
  defaultNodeDescriptor = authorDescriptor,
  defaultEdgeDescriptor = DiHyper.descriptor[Library]()
)

First, we defined node descriptors for the node types Book and Author respectively where

the typeId argument is used to denote the node type in the JSON node sections like Books in

{"nodes":{
   "Books":[{"title":"Programming in Scala","isbn":"978-0-9815316-2-5"}, ...
]}}

and

id is responsible for generating a meaningful shortcut for individual nodes, to be inserted into JSON edges as a reference like "SJ" in
```
{"edges":{
   "DiEdge":[["978-1-9351827-0-2","SJ"], ...]
}}
```
Without shortcuts for nodes, JSON edges would contain all node data, meaning that JSON texts would explode in length in proportion to the complexity of nodes and the order of the graph.
Please exercise great care when designing the id method to ensure that it returns unique keys.
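
Whether a given id scheme actually yields unique keys can be checked before exporting. The following is a minimal sketch in plain Scala that does not touch the Graph for Scala API: duplicateIds is a hypothetical helper, authorId merely repeats the key scheme of authorDescriptor, and the fifth author is invented for illustration.

```scala
object IdUniquenessSketch {
  final case class Author(surName: String, firstName: String)

  // Same key scheme as authorDescriptor.id: initials of surname and first name.
  def authorId(a: Author): String = "" + a.surName(0) + a.firstName(0)

  // Hypothetical helper: groups nodes by key and keeps only keys shared by several nodes.
  def duplicateIds[A](nodes: Iterable[A])(id: A => String): Map[String, List[A]] =
    nodes.toList.groupBy(id).filter { case (_, group) => group.size > 1 }

  val authors = List(
    Author("Odersky", "Martin"),
    Author("Spoon", "Lex"),
    Author("Venners", "Bill"),
    Author("Suereth", "Joshua D.")
  )

  // The four authors of the library graph yield distinct keys.
  val clashes = duplicateIds(authors)(authorId)

  // An invented fifth author with the initials "SL" collides with Lex Spoon.
  val clashesWithExtra =
    duplicateIds(authors :+ Author("Smith", "Laura"))(authorId)
}
```

An empty result means the scheme is safe for the nodes at hand. Initials are unlikely to stay unique as a library grows, so a longer key may be the better choice in practice.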

Thereafter, we assembled a Descriptor with the type argument Library and the constructor arguments authorDescriptor and the predefined edge descriptor DiHyper. Predefined edge descriptors have a typeId equal to their name and are type-safe with respect to the corresponding predefined edge types, which bear the name of the edge descriptor suffixed with Edge, in our example DiHyperEdge. Predefined edge descriptors are merely shortcuts for individually configurable instances of EdgeDescriptor, which we do not cover in this introduction.

At this point you’d like to inspect the resulting JSON text but, instead, you get a run-time exception telling you that "No 'NodeDescriptor' capable of processing type "demo.Book" found". So you had good reason to doubt the completeness of quickJson. Indeed, Graph JSON descriptors must cover all node/edge types contained in your graph. If you really want a partial export, you should filter your graph instance prior to exporting.

Having learned this lesson, here is a complete descriptor that suffices for our academic library graph (named arguments may be omitted; we spell them out just for better readability):

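// Di is assumed to live next to DiHyper among the predefined edge descriptors
import scalax.collection.io.json.descriptor.predefined.Di
val descriptor = new Descriptor[Library](
  defaultNodeDescriptor = authorDescriptor,
  defaultEdgeDescriptor = DiHyper.descriptor[Library](),
  namedNodeDescriptors  = Seq(bookDescriptor),
  namedEdgeDescriptors  = Seq(Di.descriptor[Library]())
)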

Passing the above descriptor to toJson yields the following JSON text (prettified afterwards):

{
  "nodes":{
    "Books":[{
      "title":"Scala in Depth",
      "isbn":"978-1-9351827-0-2"
    },{
      "title":"Programming in Scala",
      "isbn":"978-0-9815316-2-5"
    }],
    "Authors":[{
      "surName":"Odersky",
      "firstName":"Martin"
    },{
      "surName":"Spoon",
      "firstName":"Lex"
    },{
      "surName":"Venners",
      "firstName":"Bill"
    },{
      "surName":"Suereth",
      "firstName":"Joshua D."
    }]
  },
  "edges":{
    "DiEdge":[{
      "n1":"978-1-9351827-0-2",
      "n2":"SJ"
    }],
    "DiHyperEdge":[{
      "nodeIds":["978-0-9815316-2-5","OM","SL","VB"]
    }]
  }
}

Let's analyze this JSON text in more detail:

You can easily identify the node section and the edge section, denoted by the field names "nodes" and "edges" respectively, each of which holds two types. These names are defaults which may be altered by supplying a fifth argument to the constructor of Descriptor.

The above JSON text may draw criticism in that it is polluted with the repeated field names "surName", "firstName" etc. You might be inclined to reject such lengthy output. If so, just opt for what we call positional JSON, meaning that JSON values will be matched to node/edge class fields by their position. Generating positional JSON on export requires a little programming, however, namely the definition of appropriate Lift-Json custom serializers:

object PositionedNodeDescriptor {
  import net.liftweb.json._
  final class AuthorSerializer extends CustomSerializer[Author] ( fmts => ( 
    { case JArray(JString(surName) :: JString(firstName) :: Nil) => 
           Author(surName, firstName)
    },
    { case Author(surName, firstName) =>
           JArray(JString(surName) :: JString(firstName) :: Nil)
    }))
  val author = new NodeDescriptor[Author](
                   typeId            = "Authors",
                   customSerializers = Seq(new AuthorSerializer)) {
    def id(node: Any) = node match {
      case Author(surName, firstName) => "" + surName(0) + firstName(0)
    }
  }
}

For each node type we need to extend net.liftweb.json.Serializer, here by way of CustomSerializer, which is really straightforward. Then we pass an instance of the custom serializer AuthorSerializer to the node descriptor author. We have hidden implementation details by enveloping AuthorSerializer and the new NodeDescriptor author in the object PositionedNodeDescriptor, which should also contain a custom serializer for Book (left out here).
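
For orientation, the omitted Book serializer could look roughly as follows. This is only a sketch that mirrors AuthorSerializer and depends on Lift-Json alone: the names BookSerializer and BookSerializerSketch are assumptions, Book is redeclared without the Library trait to keep the snippet standalone, and the round trip uses Lift-Json's Extraction.decompose and extract rather than any Graph for Scala call.

```scala
import net.liftweb.json._

// Redeclared here so that the sketch compiles on its own.
final case class Book(title: String, isbn: String)

// Assumed name; maps a Book to a two-element JSON array and back.
final class BookSerializer extends CustomSerializer[Book](_ => (
  { case JArray(JString(title) :: JString(isbn) :: Nil) =>
      Book(title, isbn)
  },
  { case Book(title, isbn) =>
      JArray(JString(title) :: JString(isbn) :: Nil)
  }))

object BookSerializerSketch {
  implicit val formats: Formats = DefaultFormats + new BookSerializer

  val book = Book("Scala in Depth", "978-1-9351827-0-2")

  // Positional form: field values only, no field names.
  val json: JValue = Extraction.decompose(book)

  // Reading the positional form restores the original node.
  val back: Book = json.extract[Book]
}
```

A node descriptor for Book would then presumably be built in the same way as author above, with typeId "Books", an id method returning the ISBN, and the serializer passed as customSerializers.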

Now we are ready to assemble a descriptor utilizing positioned JSON texts. As the Graph for Scala JSON package also contains predefined serializers for predefined edges we do not need to implement them separately:

import scalax.collection.io.json.serializer.{
         HyperEdgeSerializer, EdgeSerializer}
val descriptor = new Descriptor[Library](
  defaultNodeDescriptor = PositionedNodeDescriptor.author,
  defaultEdgeDescriptor = DiHyper.descriptor[Library](
                                Some(new HyperEdgeSerializer)),
  namedNodeDescriptors  = Seq(PositionedNodeDescriptor.book),
  namedEdgeDescriptors  = Seq(Di.descriptor[Library](
                                Some(new EdgeSerializer)))
)

Armed with the above descriptor we then call

val exported = library.toJson(descriptor)

and verify the resulting, "condensed" JSON text:

{
  "nodes":{
    "Books":[
      ["Scala in Depth","978-1-9351827-0-2"],
      ["Programming in Scala","978-0-9815316-2-5"]
    ],
    "Authors":[
      ["Odersky","Martin"],
      ["Spoon","Lex"],
      ["Venners","Bill"],
      ["Suereth","Joshua D."]
    ]
  },
  "edges":{
    "DiHyperEdge":[["978-0-9815316-2-5","OM","SL","VB"]],
    "DiEdge":[["978-1-9351827-0-2","SJ"]]
  }
}

Importing JSON texts

Being well versed in the design of Graph for Scala JSON descriptors, there is virtually nothing more left to learn to be able to populate Graph instances from JSON texts. To process JSON texts you call fromJson:

import scalax.collection.io.json._
val library = Graph.fromJson[Library,HyperEdge](jsonTextLibrary, descriptor)

library of type Graph[Library,HyperEdge] will contain all nodes/edges derived from the node/edge sections of the JSON text jsonTextLibrary. The descriptor argument will generally be the same value as used for the export unless you intend to alter node/edge types, which would correspond to mapping one graph to another.

Note that the compiler can infer the type arguments but the result of this inference will be unsatisfactory so you are strongly advised to explicitly state the correct type arguments.

Alternatively, you can control import phases one by one:

import scalax.collection.io.json.imp.Parser._
val parsed = parse(jsonText, descriptor)
val result = Graph.fromJson[...](parsed)

Working with custom edge types

As in the following example, custom edge types must mix in Attributes, and their companion objects must extend CEdgeCompanion, to adhere to JSON descriptor requirements. Let's examine the custom edge type Transition that could serve as a transition between program states depending on keys. For the sake of simplicity we abstract away from the key modifiers Alt, Ctrl and Shift:

class Transition[N](from: N, to: N, val key: Char)
    extends DiEdge  [N](NodeProduct(from, to))
    with ExtendedKey[N]
    with EdgeCopy   [Transition]
    with EdgeIn     [N,Transition]
    with Attributes [N] {
  def keyAttributes = Seq(key)
  override protected def attributesToString = " (" + key + ")"

  type P = Transition.P
  override def attributes: P = new Tuple1(key)
  override def copy[NN](newNodes: Product): Transition[NN] = 
    Transition.newEdge[NN](newNodes, attributes)
}

object Transition extends CEdgeCompanion[Transition] {
  /** nodes are of type String. */
  def apply(from: String, to: String, key: Char) =
    new Transition[String](from, to, key)
  def unapply[N](e: Transition[String]): Option[(String,String,Char)] =
    if (e eq null) None
    else Some(e.from, e.to, e.key)

  type P = Tuple1[Char]
  override protected def newEdge[N](nodes: Product, attributes: P) =
    nodes match {
      case (from: N, to: N) =>
        new Transition[N](from, to, attributes._1)
    }
}

Most notably, attributes must be overridden by a Product containing all custom fields. The companion object must extend CEdgeCompanion and define newEdge.

Given the above definition of Transition we can instantiate a custom edge descriptor as follows:

new CEdgeDescriptor[String, Transition, Transition.type, Transition.P](
  edgeCompanion    = Transition,
  sampleAttributes = Tuple1('A'))

Note on inversion

val expLibrary = library.toJson(descriptor)
Graph.fromJson[Library,HyperEdge](
               expLibrary, descriptor) should equal (library)

Thinking of the JSON export as the inverse function of JSON import, the following rules apply:

Import(Export(graph)) == graph
as demonstrated above
Export(Import(JSON-text)) ≠ JSON-text
in most cases.

This relation should be obvious because a (JSON) text is an ordered collection of characters while a graph contains unordered sets of nodes and edges.
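
The asymmetry can be illustrated without any library. In the following plain Scala sketch the two sequences stand in for the order in which book titles appear in two JSON texts describing the same graph; all names are chosen for illustration only.

```scala
object InversionSketch {
  // Book titles in the order they appear in two JSON texts of the same graph.
  val textA = Seq("Scala in Depth", "Programming in Scala")
  val textB = Seq("Programming in Scala", "Scala in Depth")

  // Sequences compare element by element, so the order matters.
  val equalAsTexts = textA == textB

  // Sets ignore the order, just like the node set of a graph.
  val equalAsGraphs = textA.toSet == textB.toSet
}
```

Here equalAsTexts is false while equalAsGraphs is true, which is why a round trip starting from a JSON text is not guaranteed to reproduce that text character by character.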

Grammar

nodeSection^0..*  ::= JsonField( nodeSectionId : nodeValues )
nodeValues        ::= nodeList | JsonObject( nodeTypeId : nodeList )^0-1
nodeList          ::= JsonArray( JsonObject( nodeFieldId : nodeField )^1..* )^0-1 | JsonArray( JsonArray( nodeField )^1..* )^0-1
nodeField         ::= JsonValue
edgeSection^0..*  ::= JsonField( edgeSectionId : edgeValues )
edgeValues        ::= edgeList | JsonObject( edgeTypeId : edgeList )^0-1
edgeList          ::= JsonArray( JsonObject( edgeIdFields )^2..* )^0-1 | JsonArray( JsonArray( edgeFields )^2..* )^0-1
edgeIdFields      ::= (edgeFieldId : edgeField)^1..*
edgeFields        ::= (edgeField)^1..*
edgeField         ::= JsonValue

Notes on the grammar notation

Entries with the prefix Json refer to JSON values as defined in RFC 4627. The parentheses following such a Json entry are not part of the syntax. For instance,
JsonArray( JsonObject( edgeIdFields ))
reads "a JSON array containing JSON objects containing edgeIdFields".
If the multiplicity of a repetitive JSON element is restricted, the allowed multiplicity is given in superscript notation. For instance,
JsonObject( edgeTypeId : edgeList )^0-1 translates to
'{' edgeTypeId ':' edgeList '}'
with zero or one field in the JSON object. Thus it reads "a JSON object containing zero or one field".

Notes on specific grammar elements

nodeSection/edgeSection JSON fields:
The JSON text passed to the Graph conversion method fromJson will be parsed for an arbitrary number of nodeSections and edgeSections both described in the above grammar.
*Id JSON strings:
Grammar elements suffixed with Id such as nodeSectionId, nodeTypeId, edgeSectionId or edgeTypeId are always JSON strings. In general, they allow using custom JSON constants.
For instance, JSON objects containing edges will be found in the JSON text based on edgeSectionId, which defaults to "edges" but may be altered to any other name such as "arcs". In that case, the caller of a Graph conversion method passes the appropriate value for edgeSectionId in the jsonDescriptor argument.
nodeTypeId/edgeTypeId JSON Strings:
These Ids provide a means to select the appropriate node/edge descriptor.
nodeList/edgeList JSON arrays:
Nodes/edges listed in nodeList/edgeList may be represented either by JSON objects (named fields) or by JSON arrays (positioned field values).
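
To make the two representations concrete, the following sketch reads a single author node from either form using Lift-Json pattern matching only. readAuthor and NodeListFormsSketch are hypothetical names and not part of Graph for Scala; in a real import this work is done by the serializers configured in the node descriptor.

```scala
import net.liftweb.json._

object NodeListFormsSketch {
  final case class Author(surName: String, firstName: String)

  // Hypothetical reader accepting either representation of a single node.
  def readAuthor(json: JValue): Option[Author] = json match {
    // positioned field values: values are matched by their position
    case JArray(JString(surName) :: JString(firstName) :: Nil) =>
      Some(Author(surName, firstName))
    // named fields: values are looked up by field name
    case JObject(fields) =>
      for {
        surName   <- fields.collectFirst { case JField("surName", JString(s)) => s }
        firstName <- fields.collectFirst { case JField("firstName", JString(s)) => s }
      } yield Author(surName, firstName)
    case _ => None
  }

  val named      = parse("""{"surName":"Odersky","firstName":"Martin"}""")
  val positional = parse("""["Odersky","Martin"]""")
}
```

Both inputs lead to the same node; the positional form is shorter but relies on a fixed order of the field values.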

1 We could also represent a complete academic library application by a graph containing different edge types for authorship, lectorship, etc.