Two Altamira employees working on the development of Lumify recently presented at an event co-hosted by the DC Natural Language Processing and Graph Database Baltimore-Washington meetup groups. Their presentations featured Lumify's use of graph database technology to enable exploration and manipulation of NLP-enriched datasets.
Lumify is an open source platform for big data analysis and visualization built by Altamira engineers. It's designed to help organizations derive actionable insights from the large volumes of diverse data flowing through their enterprise. Using both Hadoop and Storm, it ingests and integrates virtually any kind of data, from unstructured text documents and structured datasets to images and video. Several open source analytic tools (including Tika, OpenNLP, CLAVIN, OpenCV, and ElasticSearch) are used to enrich the data, increase its discoverability, and automatically uncover hidden connections. All information is stored in a secure graph database implemented on top of Accumulo to support cell-level security of all data and metadata elements. A modern, browser-based user interface enables analysts to explore and manipulate their data, discovering subtle relationships and drawing critical new insights. In addition to full-text search, geospatial mapping, and multimedia processing, Lumify features a powerful graph visualization supporting sophisticated link analysis and complex knowledge representation.
Charlie Greenbacker, Altamira's Director of Data Science, provided an overview of Lumify and discussed how natural language processing (NLP) tools are used to enrich the text content of ingested data and automatically discover connections with other bits of information. The video from Charlie's talk is on YouTube, and the slides are available on SlideShare.
Joe Ferner, Senior Software Engineer at Altamira, described the creation of SecureGraph and how it supports authorizations, visibility strings, multivalued properties, and property metadata in a graph database. The slides and code samples from Joe's presentation are on GitHub, and the video is available on YouTube.
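To make the ideas in Joe's talk a bit more concrete, here is a minimal Python sketch of how visibility strings, authorizations, multivalued properties, and property metadata can fit together in a graph element. This is not the SecureGraph API: the names `is_visible`, `Property`, `Vertex`, `add_property`, and `get_values` are hypothetical, and the visibility syntax is a simplified take on Accumulo-style expressions that combine labels with `&`, `|`, and parentheses.

```python
import re
from dataclasses import dataclass, field

# Labels are alphanumeric; operators are &, | and parentheses (a simplification).
TOKEN = re.compile(r"[A-Za-z0-9_]+|[&|()]")


def is_visible(expression, authorizations):
    """Return True if the given set of authorizations satisfies the expression."""
    stripped = "".join(expression.split())
    if not stripped:
        return True  # an empty visibility string is readable by everyone
    tokens = TOKEN.findall(stripped)
    if "".join(tokens) != stripped:
        raise ValueError(f"invalid visibility expression: {expression!r}")
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = parse_and()
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = parse_term()
            result = result and right
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token: {token!r}")
        return token in authorizations  # a bare label

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


@dataclass
class Property:
    key: str                 # distinguishes several values stored under one name
    name: str
    value: object
    visibility: str = ""     # who may read this particular value
    metadata: dict = field(default_factory=dict)


@dataclass
class Vertex:
    vertex_id: str
    properties: list = field(default_factory=list)

    def add_property(self, key, name, value, visibility="", metadata=None):
        self.properties.append(Property(key, name, value, visibility, metadata or {}))

    def get_values(self, name, authorizations):
        """Return only the values of `name` that the caller is allowed to see."""
        return [
            p.value
            for p in self.properties
            if p.name == name and is_visible(p.visibility, authorizations)
        ]


person = Vertex("v1")
person.add_property("src1", "name", "Joe", metadata={"source": "doc-1"})
person.add_property("src2", "name", "Joseph", visibility="secret&(a|b)",
                    metadata={"source": "doc-2"})

print(person.get_values("name", set()))            # ['Joe']
print(person.get_values("name", {"secret", "a"}))  # ['Joe', 'Joseph']
```

In this sketch the check happens per property value rather than per vertex, so two readers with different authorizations can see different values for the same property name on the same vertex.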
The crowd of about 100 NLP enthusiasts and graph database aficionados asked a ton of great questions, and the technical discussions continued long after the presentations concluded. Many thanks to Neo4j for sponsoring the pizza and beer, and to Segue Technologies for providing the meeting space.