Max de Marzi talks about Facebook Graph Search, how it works, and how you can apply similar mechanics to Neo4j.


Facebook Graph Search has given the Graph Database community a simpler way to explain what it is we do and why it matters. I wanted to drive the point home by building a proof of concept of how you could do this with Neo4j. However, I don’t have six months or much experience with NLP (natural language processing). What I do have is Cypher. Cypher is Neo4j’s graph language and it makes it easy to express what we are looking for in the graph. I needed a way to take “natural language” and create Cypher from it. This was going to be a problem.

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

It’s an old programmer joke, but that is what came to mind. Some kind of fuzzy regular expressions. In the iPhone world, we usually hear people say “There’s an app for that”. In the Ruby world, we go with “There’s a Gem for that”… so I asked Google for some help and came upon Semr.

Semr is the gateway drug framework to supporting natural language processing in your application. Its goal is to follow the 80/20 rule where 80% of what you want to express in a DSL is possible in familiar way to how developers normally solve solutions. (Note: There are other more flexible solutions but also come with a higher learning curve, i.e. like treetop)

Awesome, a ray of light to solve my problem… but the Gem is 4 years old. I could not get it to install. Bummer… Wait, what was that about Treetop?

Treetop is a language for describing languages. Combining the elegance of Ruby with cutting-edge parsing expression grammars, it helps you analyze syntax with revolutionary ease.

Score! Now I had no idea how to write a proper language grammar, but that’s never stopped anyone before. Someone who has more than a couple hours of experience with Treetop is going to laugh at this, but I’ll show you part of what I did:
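The grammar itself is in the full article. As a rough stand-in for the idea, and not the author's Treetop grammar, here is a minimal Ruby sketch that uses one plain regular expression to turn a single phrase shape into Cypher. The phrase shape, the `Person` label, the `FRIENDS` and `LIKES` relationship types, the `name` property, and the `to_cypher` helper are all assumptions invented for illustration.

```ruby
# Hypothetical phrase shape: "friends of <person> who like <thing>".
# A real grammar (for example one written with Treetop) would cover far more.
PHRASE = /\Afriends of (?<person>.+?) who likes? (?<thing>.+)\z/i

# Returns [cypher, params] for a matching phrase, or nil when the phrase
# does not fit the one shape this sketch understands.
def to_cypher(text)
  match = PHRASE.match(text.strip)
  return nil unless match

  # Labels, relationship types, and property names are assumed, not taken
  # from the article. Values go in as parameters, not into the query text.
  cypher = <<~CYPHER
    MATCH (me:Person {name: $person})-[:FRIENDS]-(friend:Person)
    MATCH (friend)-[:LIKES]->(thing {name: $thing})
    RETURN DISTINCT friend.name
  CYPHER

  [cypher, { person: match[:person], thing: match[:thing] }]
end

cypher, params = to_cypher("Friends of Max who like Neo4j")
puts cypher
p params
```

Passing the captured words as query parameters keeps user input out of the Cypher text. A parsing expression grammar would replace the single regular expression with composable rules, which is what makes more phrase shapes manageable.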

Read the Full Article Here.