Software Structure Analysis and Metric Calculation with Neo4J and Cypher

During the last weeks, the Software Analytics and Evolution research team at the Software Competence Center Hagenberg (the group I currently work in) built a tool for parsing large-scale legacy software systems written in C and C++, but also in FORTRAN, Structured Text (IEC 61131 machinery and robot programs) or Matlab. The goal is to analyse the structure of the source code using a Neo4J graph database and Cypher queries. As you can see in this demo video, the tool visualizes important aspects and metrics as well as the architecture and structure of the analyzed software system. It is meant to support companies in developing and maintaining their large software systems and code bases.
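To give a rough idea of what "metric calculation on a graph" means, here is a minimal Python sketch, not the tool's actual implementation. It assumes a tiny, made-up call graph (the function names and the `CALLS` dictionary are purely illustrative) and computes two classic structure metrics, fan-out and fan-in, which in a graph database would correspond to counting outgoing and incoming call relationships of a node.

```python
from collections import defaultdict

# Hypothetical call graph: each key is a function, each value lists the
# functions it calls. In a graph database these would be nodes connected
# by call relationships.
CALLS = {
    "main": ["init", "run"],
    "run": ["read_sensor", "log"],
    "init": ["log"],
    "read_sensor": [],
    "log": [],
}

def fan_out(calls):
    """Number of distinct functions each function calls."""
    return {fn: len(set(callees)) for fn, callees in calls.items()}

def fan_in(calls):
    """Number of distinct functions that call each function."""
    callers = defaultdict(set)
    for fn, callees in calls.items():
        for callee in callees:
            callers[callee].add(fn)
    return {fn: len(callers[fn]) for fn in calls}

if __name__ == "__main__":
    print("fan-out:", fan_out(CALLS))
    print("fan-in: ", fan_in(CALLS))
```

A function with a high fan-in (here `log`) is a widely shared dependency, while a high fan-out hints at a function that coordinates a lot of other code.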

The Database Decision – a hard choice for startups

File:GraphDatabase PropertyGraph.png

Did you ever wonder how or why startups choose a very specific kind of database for implementing their solution? According to ‘the-software-behind-facebook‘, Facebook for example uses a multitude of different database technologies to fulfill its immense scalability requirements. After 30 years of relational database domination, some other very interesting database, persistence, caching and query approaches have at last appeared on the market. Most of the time these approaches are divided into the categories RDBMS and NOSQL, and combining several of them in one system is also known as Polyglot Persistence. NOSQL is the umbrella term for a really wide spectrum of approaches and technologies such as:

  1. Key-Value-stores
  2. BigTable-implementations
  3. Document-stores
  4. Graph Databases
For key-value systems like Voldemort or Tokyo Cabinet, the smallest modeling unit is the key-value pair. For BigTable clones, on the other hand, it is a tuple with a variable number of attributes. In document databases like CouchDB and MongoDB, the document is the smallest modeling unit of data. Graph databases such as Neo4J model the whole dataset as one big dense network structure.
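The four modeling units can be compared side by side with plain Python data structures. This is only an illustration of the shapes of the data, not of how any of the named products stores it internally. The user record, the key names and the `neighbours` helper are all made up for this sketch.

```python
# The same hypothetical user record expressed in each of the four models.

# 1. Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "Linz"}'}

# 2. BigTable-style store: a row key mapping to a variable set of columns.
wide_column = {"user:42": {"profile:name": "Ada", "profile:city": "Linz"}}

# 3. Document store: a self-contained, possibly nested document.
document = {"_id": 42, "name": "Ada", "address": {"city": "Linz"}, "tags": ["dev"]}

# 4. Graph database: nodes with properties plus typed relationships between them.
nodes = {42: {"name": "Ada"}, 7: {"name": "Bob"}}
relationships = [(42, "KNOWS", 7)]

def neighbours(node_id, rel_type):
    """Follow all outgoing relationships of one type from a node."""
    return [end for start, rel, end in relationships
            if start == node_id and rel == rel_type]

if __name__ == "__main__":
    print(neighbours(42, "KNOWS"))
```

Only the graph model makes the connection between two records a first-class piece of data. In the other three, a relationship has to be encoded inside a value and resolved by the application.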
Social applications such as Facebook often use graph databases because relational databases perform really badly when querying recursive structures, for instance file trees, and network structures such as social graphs.
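Why recursive structures are awkward for a relational model becomes clearer with a small example. A "friends within n hops" question needs one additional self-join per hop in plain SQL, whereas on a graph it is a simple breadth-first traversal. The sketch below assumes a tiny, invented friendship graph (`FRIENDS`) held in memory. It illustrates the traversal idea only and says nothing about how Neo4J executes such a query internally.

```python
from collections import deque

# Hypothetical social graph as an adjacency list.
FRIENDS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

def within_hops(graph, start, max_hops):
    """Return every node reachable from start in at most max_hops steps,
    mapped to its distance from start (breadth-first search)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]  # the start node itself is not a result
    return distance

if __name__ == "__main__":
    print(within_hops(FRIENDS, "alice", 2))
```

Each node is visited at most once, so the cost depends on the size of the neighbourhood that is actually explored rather than on the size of the whole dataset.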
Another aspect to consider is whether a database offers convenient spatial query operations, in order to speed up the lookup of specific locations or areas. Many modern mobile apps use these spatial queries to locate places, things, events or friends. Foursquare is one of the most prominent examples of a location-based social application.
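The typical spatial query behind such apps is "which places lie within r kilometres of this point?". The following sketch answers it with a naive linear scan and the haversine great-circle distance. The place names and coordinates are invented for illustration, and a real database would normally use a spatial index instead of checking every entry.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical places with made-up coordinates (latitude, longitude).
PLACES = {
    "cafe": (48.3069, 14.2858),
    "museum": (48.3100, 14.2840),
    "airport": (48.2332, 14.1875),
}

def places_within(places, lat, lon, radius_km):
    """Names of all places within radius_km of the point, nearest first."""
    hits = [(haversine_km(lat, lon, plat, plon), name)
            for name, (plat, plon) in places.items()]
    return [name for dist, name in sorted(hits) if dist <= radius_km]

if __name__ == "__main__":
    print(places_within(PLACES, 48.3069, 14.2858, 1.0))
```

The linear scan costs one distance computation per stored place, which is exactly the work a database with built-in spatial operations tries to avoid.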
A very good article by Peter Neubauer, the COO of Neo Technology, explains the differences between relational, NOSQL and graph databases and is definitely worth reading!

 File:Anchor Modeling Example.svg