Webinar presented by Mahesh Chaudhari, Sr. Software Engineer at Zephyr Health and speaker at the upcoming GraphConnect SF

Z-Platform is a new platform that ingests data of any kind, stores it as JSON documents in MongoDB, and maintains a sparse representation of the same data in a Neo4j graph database. The platform provides services such as record linkage using exact and fuzzy matching, address standardization and disambiguation, and duplicate entity merging. The suite of applications developed at Zephyr relies heavily on the graph database for most of its data. Creating nodes and relationships (edges) in the graph is one of the most important steps in the Z-Platform import process. Grouping all the relationships and creating them at the end led to deadlocks, so a lot of time was spent in retry logic and building relationships became a time-consuming process. To improve performance, the approach described in this presentation detects and avoids deadlocks at run time by creating non-conflicting batches of relationships, while still aiming to create large sets of relationships in a minimal amount of time. We explored the traditional graph coloring algorithm to detect deadlocks and the bipartite graph concept to create batches of relationships that avoid deadlocks. This approach has improved the performance of the system significantly. The test environment ranged from small graphs (up to 10,000 relationships) to very large graphs (up to 39 million relationships). The average performance of the system is 3741 relationships per minute.
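The batching idea can be illustrated with a minimal sketch. This is not the Z-Platform implementation, whose details are not given here. It assumes that two relationships conflict when they share a node (since creating a relationship locks both endpoints), and it uses a simple greedy edge coloring in which each color is one batch. The function name `batch_relationships` and the `(start, end)` tuple format are hypothetical.

```python
from collections import defaultdict


def batch_relationships(relationships):
    """Split (start, end) pairs into batches where no two pairs share a node.

    Assumption for illustration: relationships that touch the same node
    would contend for the same lock, so they must go in different batches.
    """
    used = defaultdict(set)  # node -> indices of batches already touching it
    batches = []
    for start, end in relationships:
        taken = used[start] | used[end]
        # Pick the lowest batch index not yet used by either endpoint.
        idx = 0
        while idx in taken:
            idx += 1
        if idx == len(batches):
            batches.append([])
        batches[idx].append((start, end))
        used[start].add(idx)
        used[end].add(idx)
    return batches


if __name__ == "__main__":
    rels = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("a", "c")]
    for i, batch in enumerate(batch_relationships(rels)):
        print(i, batch)
```

Under this assumption, the relationships within one batch touch disjoint sets of nodes and could be written concurrently, while the batches themselves would be processed one after another. A greedy assignment like this does not guarantee the minimum number of batches.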

Speaker: Mahesh Chaudhari, Sr. Software Engineer, Zephyr Health Inc.
Dr. Mahesh Chaudhari has been a Sr. Software Engineer at Zephyr Health Inc. since September 2012. His primary responsibilities involve data modeling and integration, ontology development, and algorithm design for the Z-Platform. He has a Ph.D. in Computer Science from Arizona State University, where he focused on incremental maintenance of views defined over loosely-coupled heterogeneous data sources. He has extensive research experience in relational, object-relational, and XML databases, including query processing and optimization. He has held one principal NSF grant at ASU since 2008 and one supplemental NSF grant since 2011 supporting undergraduate research in database curriculum, benchmarking, and performance evaluation. He has 2 years of teaching experience and was named a PFF Emeriti Fellow for the 2009-2010 academic year for excellence in research and teaching.

Watch the video here