Mark Needham goes into detail about building a quick Neo4j graph database that retailers could use to tell whether a suitable substitute exists for a product.


One of the interesting problems in the world of online shopping from the perspective of the retailer is working out whether there is a suitable substitute product if an ordered item isn’t currently in stock.

Since this problem brings together three types of data – order history, stock levels and products – it seems like it should be a nice fit for Neo4j, so I ‘graphed up’ a quick example.

I wrote some Cypher to create some products, a person, a few orders and the availability of those products in an imaginary store.
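The Cypher itself is not reproduced in this excerpt. As a rough stand-in, the same shape of data (products, a person, orders and per-store availability) can be sketched in plain Python. Every name here, including the products, the `Graph` class and the `substitutes` helper, is a made-up illustration and not taken from the original post, and "same category and in stock" is only an assumed rule for what counts as a substitute.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical in-memory stand-in for the graph described above."""
    categories: dict = field(default_factory=dict)  # product -> category
    orders: dict = field(default_factory=dict)      # order id -> (person, [products])
    stock: dict = field(default_factory=dict)       # (store, product) -> units available


def substitutes(graph, store, product):
    """Return in-stock products at `store` that share `product`'s category."""
    category = graph.categories[product]
    return sorted(
        other
        for other, other_category in graph.categories.items()
        if other != product
        and other_category == category
        and graph.stock.get((store, other), 0) > 0
    )


# Illustrative data only: a few products, one person, two orders, one store.
graph = Graph(
    categories={
        "Tinned Tomatoes": "Tomatoes",
        "Chopped Tomatoes": "Tomatoes",
        "Passata": "Tomatoes",
        "Bananas": "Fruit",
    },
    # Orders are part of the model but are not used by this simple lookup.
    orders={
        1: ("Mark", ["Tinned Tomatoes", "Bananas"]),
        2: ("Mark", ["Passata"]),
    },
    stock={
        ("Imaginary Store", "Tinned Tomatoes"): 0,
        ("Imaginary Store", "Chopped Tomatoes"): 0,
        ("Imaginary Store", "Passata"): 5,
        ("Imaginary Store", "Bananas"): 12,
    },
)

print(substitutes(graph, "Imaginary Store", "Tinned Tomatoes"))
```

In a real graph database the category and stock lookups would be relationship traversals instead of dictionary scans, but the question being asked is the same.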

Although it’s not exactly the same as what I want to do, I need to look into it more to see whether some of the ideas can be applied.

I also learnt that the terminology for what I’m looking for is a ‘similar items’ algorithm, and I think what I’m looking to spike would be a hybrid recommender system which combines content similarity with a user’s previous purchase history.
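To make the idea of a hybrid recommender a little more concrete, here is a minimal sketch that blends a content-similarity score with a purchase-history score. It assumes Jaccard similarity over product attribute sets and a simple weighted average; the attribute sets, the purchase counts, the `weight` parameter and the function names are all hypothetical choices for illustration, not the approach from the original post.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of attributes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_scores(target, attributes, history, weight=0.5):
    """Score every other product as a weighted mix of two signals.

    attributes: product -> set of descriptive attributes (content signal)
    history:    product -> number of times the user bought it (history signal)
    weight:     share given to content similarity; the rest goes to history
    """
    total_purchases = sum(history.values())
    scores = {}
    for candidate, candidate_attributes in attributes.items():
        if candidate == target:
            continue
        content = jaccard(attributes[target], candidate_attributes)
        bought = history.get(candidate, 0) / total_purchases if total_purchases else 0.0
        scores[candidate] = weight * content + (1 - weight) * bought
    return scores


# Illustrative data only.
attributes = {
    "Tinned Tomatoes": {"tomato", "tinned"},
    "Chopped Tomatoes": {"tomato", "tinned", "chopped"},
    "Passata": {"tomato", "jar"},
    "Bananas": {"fruit"},
}
history = {"Passata": 3, "Bananas": 1}

scores = hybrid_scores("Tinned Tomatoes", attributes, history)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With an even weighting, a product the user buys often can outrank one that is more similar on paper; setting `weight=1.0` falls back to pure content similarity.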

I’ve been looking around to see if there are any open or anonymised retail data sets to play around with, but all I’ve come across is the ‘Frequent Itemset Mining Dataset Repository’. Unfortunately, when I tried to open the files they seemed to contain nothing but random numbers, so I must be doing something wrong.
