Source code structure

BigConnect is hosted on GitHub and uses Git for source control. To obtain the source code, first install Git on your system. GitHub provides instructions for installing and setting up Git.

The source code is divided into multiple repositories that make up the platform as a whole:

  • bigconnect - contains the source code for the core of the platform and the graph engine

  • dw-plugins - contains various Data Workers that can be used for different purposes

  • bigconnect-web - contains the Web UI
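As a rough sketch of fetching all three repositories, the snippet below prints one git clone command per repository. The base URL is an assumption (a GitHub organization named "bigconnect"), since the exact repository URLs are not listed here, and clone_cmd is a hypothetical helper name used only for illustration. Check the printed commands against the actual project page before running them.

```shell
#!/bin/sh
# ASSUMPTION: the repositories live under a GitHub organization called
# "bigconnect". Change BASE_URL if the project is hosted elsewhere.
BASE_URL="https://github.com/bigconnect"

# Repository names as listed above.
REPOS="bigconnect dw-plugins bigconnect-web"

# clone_cmd (hypothetical helper): print the git command for one repository.
clone_cmd() {
    printf 'git clone %s/%s.git\n' "$BASE_URL" "$1"
}

# Dry run: only print the commands so they can be reviewed first.
for repo in $REPOS; do
    clone_cmd "$repo"
done
```

Once the printed URLs look right, piping the script's output to sh runs the clones in the current directory.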

The bigconnect repository

Contains the core of the platform, including the code for the graph engine, data workers, long-running processes, ontology and more.

Directory structure:

  • common - common code shared among the multiple BigConnect components

    • core - core BigConnect services for users, data workers, notifications, workspaces etc.

    • cypher - support for Cypher queries

    • elastic - search index implementation for Elasticsearch

    • graph - core graph engine API

    • security - shared files for the security model

    • sql - support for SQL queries

  • plugins - various plugins such as RocksDB storage, Spark integration, Proxy etc.

  • proxyserver-api - Java client driver

  • proxyserver - Java server

  • shell - Groovy shell

  • test - testing harness

The dw-plugins repository

Contains various required and optional Data Worker plugins:

  • audio-metadata - Extracts audio metadata (size and duration) using ffprobe

  • audio-mp4-encoder - Encodes audio data into MP4 format using ffmpeg

  • audio-ogg-encoder - Encodes audio data into OGG format using ffmpeg

  • azure-image-ocr - Uses Azure Computer Vision API to extract text from image resources

  • azure-image-tags - Uses Azure Computer Vision API to extract tags from image resources

  • dictionary-extractor - Extracts terms from text using a dictionary

  • google-nlp - Uses Google Cloud Natural Language API to extract named entities

  • google-s2t - Uses Google Cloud Speech-to-Text API to extract audio transcripts

  • groovy - Runs Groovy scripts as data workers

  • image-metadata-extractor - Extracts image metadata using Drew Noakes' metadata-extractor library

  • location-extractor - Extracts location from text using the CLAVIN Open Source Geotagger

  • mime-type-ontology-mapper - Maps MIME types to ontology concepts

  • opencv-object-detector - Detects objects in images (like faces) using OpenCV

  • opennlp-me-extractor - Extracts terms from text using an OpenNLP maximum entropy model

  • phone-number-extractor - Extracts phone numbers from text using libphonenumber

  • regex-extractor - Extracts terms from text based on regular expressions

  • tika-mime-type - Uses Apache Tika to determine content MIME type

  • tika-text-extractor - Uses Apache Tika to extract text from various content types (PDF, DOC, etc.)

  • video-audio-extract - Extracts the audio stream from a video using ffmpeg

  • video-frame-extract - Extracts video frames for image processing using ffmpeg

  • video-metadata - Extracts video metadata (geolocation, date, device, width etc.) with ffprobe

  • video-mp4-encoder - Encodes video into MP4 format with ffmpeg

  • video-poster-frame - Gets a video thumbnail by extracting a frame from the video using ffmpeg

  • video-webm-encoder - Encodes video into WebM format with ffmpeg

The bigconnect-web repository

Contains all the code for the web console and its plugins:

  • config - configuration files for running the web console

  • plugins - bundled plugins for the web console

    • advanced-dictionary-manager - plugin for managing dictionaries for the dictionary-extractor data worker

    • auth-username-only - plugin for authentication using only the username

    • auth-username-password - plugin for authentication using username and password, with forgot-password support

    • graph-product - graph analysis plugin

    • map-product - spatial analysis plugin

    • rest-explorer - plugin for browsing and interacting with the REST API endpoints

    • terms-of-use - plugin for displaying and agreeing to initial terms & conditions

  • tomcat-server - contains all dependencies for running the embedded Tomcat server

  • war - source code for the web front-end (HTML, LESS, JavaScript)

  • web-base - source code for the web back-end